Preface



The tenth anniversary of the LOPSTR¹ symposium provided the incentive for
this volume. LOPSTR started in 1991 as a workshop on logic program synthesis 
and transformation, but later it broadened its scope to logic-based program 
development in general, that is, program development in computational logic, 
and hence the title of this volume. 

The motivating force behind LOPSTR has been the belief that declarative 
paradigms such as logic programming are better suited to program development 
tasks than traditional non-declarative ones such as the imperative paradigm. 
Specification, synthesis, transformation or specialization, analysis, debugging 
and verification can all be given logical foundations, thus providing a unifying 
framework for the whole development process. 

In the past 10 years or so, such a theoretical framework has indeed begun to
emerge. Tools have even been implemented for analysis, verification and
specialization.

However, it is fair to say that so far the focus has largely been on programming- 
in-the-small. So the future challenge is to apply or extend these techniques to 
programming-in-the-large, in order to tackle software engineering in the real 
world. 

Returning to this volume, our aim is to present a collection of papers that 
reflect significant research efforts over the past 10 years. These papers cover the 
whole development process: specification, synthesis, analysis, transformation and 
specialization, as well as semantics and systems. 

We would like to thank all the authors for their valuable contributions that 
made this volume possible. We also thank the reviewers for performing their 
arduous task meticulously and professionally: Annalisa Bossi, Nicoletta Cocco, 
Bart Demoen, Danny De Schreye, Yves Deville, Sandro Etalle, Pierre Flener, 
John Gallagher, Samir Genaim, Gopal Gupta, Ian Hayes, Patricia Hill, Andy 
King, Vitaly Lagoon, Michael Leuschel, Naomi Lindenstrauss, Nancy Mazur, 
Mario Ornaghi, Dino Pedreschi, Alberto Pettorossi, Maurizio Proietti,
C.R. Ramakrishnan, Sabina Rossi, Abhik Roychoudhury, Salvatore Ruggieri,
Tom Schrijvers, Alexander Serebrenik, Jan-Georg Smaus, Wim Vanhoof and
Sofie Verbaeten.



April 2004 



Maurice Bruynooghe and Kung-Kiu Lau 



¹ http://www.cs.man.ac.uk/~kung-kiu/lopstr/
Abstract. In order to provide a formalism for defining program correctness
and to reason about program development in Computational
Logic, we believe that it is better to distinguish between specifications 
and programs. To this end, we have developed a general approach to 
specification that is based on a model-theoretic semantics. In our pre- 
vious work, we have shown how to define specifications and program 
correctness for open logic programs. In particular we have defined a no- 
tion of correctness called steadfastness, that captures at once modularity, 
reusability and correctness. In this paper, we review our past work and 
we show how it can be used to define compositional units that can be 
correctly reused in modular or component-based software development. 



1 Introduction 

In software engineering, requirements analysis, design and implementation are 
distinctly separate phases of the development process [18], as they employ dif- 
ferent methods and produce different artefacts. In requirements analysis and 
design, specifications play a central role, as a frame of reference capturing the 
requirements and the design decisions. By contrast, data and programs only ap- 
pear in the implementation phase, towards the end of the development process. 
There is therefore a clear distinction between specifications and programs. 

In Computational Logic, however, this distinction is usually not maintained. 
This is because there is a widely held view that logic programs are executable 
specifications and therefore there is no need to produce specifications before the 
implementation phase of the development process. We believe that undervalu- 
ing specifications in this manner is not an ideal platform for program devel- 
opment. If programs are indistinguishable from specifications, then how do we 
define program correctness, and how do we reason about program development? 
We hold the view that the meaning of correctness must be defined in terms of 
something other than logic programs themselves. We are not alone in this, see 
e.g., [17, p. 410]. In our view, the specification should axiomatise all our
relevant knowledge of the problem context and the necessary data types, whereas,
for complexity reasons, programs rightly capture only what is strictly
necessary for computing. In the process of extracting programs from specifications,
a lot of knowledge is lost, making programs much weaker axiomatisations. This 
suggests that specifying and programming are different activities, involving dif- 
ferent methodological aspects. Thus, we take the view that specifications should 
be clearly distinguished from programs, especially for the purpose of program 
development. Indeed, we have shown (in [28,29]) that in Computational Logic, 
not only can we maintain this distinction, but we can also define various kinds 
of specifications for different purposes. Moreover, we can also define correctness 
with respect to these specifications. 

Our semantics for specification and correctness is model-theoretic. The
declarative nature of such a semantics allows us to define steadfastness [34], a notion of
correctness that captures at once modularity, reusability and correctness. Open 
programs are incomplete pieces of code that can be (re)used in many different 
admissible situations, by closing them (by adding the missing code) in many 
different ways. Steadfastness of an open program P is pre-proved correctness of 
the various closures of P, with respect to the different meanings that the spec- 
ification of P assumes in the admissible situations. For correct reuse, we need 
to know when a situation is admissible. This knowledge is given by the prob- 
lem context. We have formalised problem context as a specification framework 
[27], namely, a first-order theory that axiomatises the problem context, charac- 
terises the admissible situations as its (intended) models, and is used to write 
specifications and to reason about them. 

In this paper, we review our work in specification and correctness of logic pro- 
grams, including steadfastness. Our purpose is to discuss the role of steadfastness 
for correct software development. In particular, we are interested in modularity 
and reuse, which are key aspects of software development. Our work is centred 
on the notion of a compositional unit. A compositional unit is a software com- 
ponent, which is commonly defined as a unit of composition with contractually 
specified interfaces and context dependencies only [46]. The interfaces declare 
the imported and exported operations, and the context dependencies specify the 
constraints that must be satisfied in order to correctly (re)use them. Through- 
out the paper, we will not refer to compositional units as software components, 
however, for the simple reason that as yet there is no standard definition for the 
latter (although the one we used above [46] is widely accepted). So we prefer 
to avoid any unnecessary confusion. In our compositional units, the interfaces
and the context dependencies are declaratively specified in the context of the
specification framework $\mathcal{F}$ axiomatising the problem context.
$\mathcal{F}$ gives a precise semantics to specifications and allows us to reason
about the correctness of programs, as well as their correct reuse. Thus, in our
formalisation, a compositional
unit has a three-tier structure, with separate levels for framework, specifications 
and programs. 

We introduce compositional units in Section 2, and consider the three levels 
separately. We focus on model-theoretic semantics of frameworks and specifica- 
tions, and on steadfastness (i.e., open program correctness). 
In Section 3, we show how the proposed formalisation of compositional units
can be used to support correct reuse. Our aim is to highlight the aspects related 
to specifications, so we consider only the aspects related to the framework and 
the specification levels, while assuming the possibility of deriving (synthesising) 
steadfast programs from specifications. 

At the end of each section we briefly discuss and compare our results with 
related work, and finally in the conclusion we comment on future developments. 



2 Compositional Units 

In our approach, compositional units represent correctly reusable units of specifi- 
cations and correct open programs. Our view is that specifications and programs 
are not stand-alone entities, but are always to be considered in the light of 
a problem context. The latter plays a central role: it is the semantic context 
in which specifications and program correctness assume their appropriate mean- 
ing, and it contains the necessary knowledge for reasoning about correctness and 
correct reuse. This is reflected in the three-tier structure (with model-theoretic 
semantics) of a compositional unit, as illustrated in Figure 1. 



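Compositional Unit K

    Framework $\mathcal{F}(\Pi_{\mathcal{F}} \Rightarrow \Delta_{\mathcal{F}})$
        Signature $\Sigma$
        Axioms $X$
        Theorems $T$

    Specifications
        $Sp_1; \ldots; Sp_n; RD_1; \ldots; RD_k$

    Programs
        $P_{id_1} : S_{\pi_1} \Rightarrow S_{\delta_1} \{C_1\}; \ldots; P_{id_h} : S_{\pi_h} \Rightarrow S_{\delta_h} \{C_h\}$

Fig. 1. A three-tier formalism.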



At the top level of a compositional unit K, we have a specification framework
$\mathcal{F}$, or framework for short, that embodies an axiomatisation of the
problem context. $\mathcal{F}$ has a signature $\Sigma$, a set $X$ of axioms, a
set $T$ of theorems, a list $\Pi_{\mathcal{F}}$ of open symbols, and a list
$\Delta_{\mathcal{F}}$ of defined symbols. The syntax
$\Pi_{\mathcal{F}} \Rightarrow \Delta_{\mathcal{F}}$ indicates that the axioms of
$\mathcal{F}$ fix (the meaning of) the symbols $\Delta_{\mathcal{F}}$ whenever
$\mathcal{F}$ is composed with frameworks that fix $\Pi_{\mathcal{F}}$. The
defined and open symbols belong to the signature $\Sigma$, which may also contain
closed symbols, namely symbols defined completely by the axioms (i.e.,
independently from $\Pi_{\mathcal{F}}$). Frameworks are explained in Section 2.1,
and framework composition is explained in Section 3.1.

In the middle, we have the specification section. Its role is to bridge the gap
between the framework $\mathcal{F}$ and the chosen programming language. So far,
we have considered only logic programs, and the corresponding specification
formalism is explained in Section 2.2. The specification section contains the
specifications $Sp_1, \ldots, Sp_n$ of the program predicates occurring in the
program section. It may also contain a set of specification reduction theorems
$RD_1, \ldots, RD_k$, that are useful to reason about correct reuse.
Specification reduction is explained in Section 3.2.

At the bottom, we have the program section. Programs are open logic (or
constraint logic) programs. An open program
$P_{id_i} : S_{\pi_i} \Rightarrow S_{\delta_i} \{C_i\}$ ($1 \le i \le h$) has
an identifier $id_i$, an interface specification
$S_{\pi_i} \Rightarrow S_{\delta_i}$ and a set $\{C_i\}$ of implementation
clauses. $S_{\pi_i}$ and $S_{\delta_i}$ are lists of specifications defined in
the specification section. An interface specification contains all the
information needed to correctly reuse a correct program. Programs and
correctness are explained in Section 2.3. Correct reuse is explained in
Section 3.3.



2.1 Specification Frameworks 

A specification framework $\mathcal{F}$ is defined in the context of first-order
logic, and contains the relevant knowledge of the necessary concepts and data
types for building a model of the application at hand.

We distinguish between closed and open frameworks. A closed framework
$\mathcal{F} = \langle \Sigma, X, T \rangle$ has a signature $\Sigma$, a set $X$
of axioms, and a set $T$ of theorems. It has no open and defined symbols, that
is, all the symbols of $\Sigma$ are closed.

Example 1. An example of a closed framework is first-order arithmetic
NAT = $\langle \Sigma_{Nat}, X_{Nat}, T_{Nat} \rangle$, introduced by the
following syntax:¹

Framework NAT;

DECLS:  $Nat$ : sort;
        $0 : [\,] \to Nat$;
        $s : [Nat] \to Nat$;
        $+, * : [Nat, Nat] \to Nat$;

AXS:    $Nat$ : construct($0, s : Nat$);
        $+$ : $i + 0 = i$;
              $i + s(j) = s(i + j)$;
        $*$ : $i * 0 = 0$;
              $i * s(j) = i * j + i$;

THMS:   $i + j = j + i$;



¹ In all the examples, we will omit the outermost universal quantifiers, but
their omnipresence should be implicitly understood.
The signature $\Sigma_{Nat}$, introduced in the declaration section DECLS, is the
signature of Peano’s arithmetic. The axioms $X_{Nat}$, introduced in the AXS
section, are the usual ones of first-order arithmetic. 0 and s are the
constructors of Nat and their axioms, which we call the constructor axioms for
Nat, are collectively indicated by construct(0, s : Nat). The latter contains
Clark’s equality theory [35] for 0 and s, as well as all the instances of the
first-order induction schema. NAT has been widely studied, and there are a lot
of known theorems (in section THMS), including for example the associative,
commutative and distributive laws.
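
As an informal illustration (not part of the original formalism), the recursive axioms for + and * can be read as a terminating evaluator over ground terms built from 0 and s. The Python sketch below assumes a hypothetical tuple encoding of terms; the names `Z`, `s`, `add`, `mul`, `numeral` and `to_int` are illustrative choices, not notation from the framework.

```python
# Ground terms of sort Nat: Z stands for 0, ("s", t) stands for s(t).
# The encoding and all names below are illustrative assumptions.
Z = ("0",)


def s(t):
    """Successor constructor."""
    return ("s", t)


def add(i, j):
    # Axioms for +:  i + 0 = i ;  i + s(j) = s(i + j)
    if j == Z:
        return i
    return s(add(i, j[1]))


def mul(i, j):
    # Axioms for *:  i * 0 = 0 ;  i * s(j) = i * j + i
    if j == Z:
        return Z
    return add(mul(i, j[1]), i)


def numeral(n):
    """Build the ground term s(s(...s(0)...)) with n successors."""
    t = Z
    for _ in range(n):
        t = s(t)
    return t


def to_int(t):
    """Count the successors of a ground term."""
    n = 0
    while t != Z:
        t = t[1]
        n += 1
    return n
```

Every ground term evaluates to a numeral, which is an informal counterpart of reachability; the theorem i + j = j + i can be spot-checked on small numerals, although such testing is of course no substitute for the inductive proof.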

Theorems are an important part of a framework. However, they are not 
relevant in the definitions that follow, so we will not refer to them explicitly 
here. 

For closed frameworks we adopt isoinitial semantics, that is, we choose the
intended model of $\mathcal{F} = \langle \Sigma, X \rangle$ to be a reachable
isoinitial model, defined as follows:

Definition 1 (Reachable Isoinitial Model [5]). Let $X$ be a set of
$\Sigma$-axioms. A $\Sigma$-structure $I$ is an isoinitial model of $X$ iff, for
every model $M$ of $X$, there is a unique isomorphic embedding $f : I \to M$.

A model $I$ is reachable if its elements can be represented by ground terms.

Definition 2 (Adequate Closed Frameworks [30]). A closed framework
$\mathcal{F} = \langle \Sigma, X \rangle$ is adequate iff there is a reachable
isoinitial model $I$ of $X$ that we call ‘the’ intended model of $\mathcal{F}$.

In fact $I$ is one of many intended models of $\mathcal{F}$, all of which are
isomorphic. So $I$ is unique up to isomorphism, and hence our (ab)use of ‘the’.

As shown in [5], adequacy entails the computability of the operations and 
predicates of the signature. 

Example 2. NAT is an adequate closed framework. Its intended model is the
standard structure $N$ of natural numbers ($N$ is a reachable isoinitial model
of $X_{Nat}$). $N$ interprets Nat as the set of natural numbers, and s, + and *
as the successor, sum and product function, respectively.

The adequacy of a closed framework is not a decidable property. We have the 
following useful proof-theoretic characterisation, which can be seen as a “richness 
requirement” implicit in isoinitial semantics [31]: 

Definition 3 (Atomic Completeness). A framework
$\mathcal{F} = \langle \Sigma, X \rangle$ is atomically complete iff, for every
ground atomic formula $A$, either $X \vdash A$ or $X \vdash \neg A$.

Theorem 1 (Adequacy Condition [38]). A closed framework
$\mathcal{F} = \langle \Sigma, X \rangle$ is adequate iff it has at least one
reachable model and is atomically complete.

Closed adequate frameworks can be built incrementally, starting from a 
closed adequate kernel, by means of adequate extensions. 
Definition 4 (Adequate Extensions [30]). An adequate extension of an adequate
closed framework $\mathcal{F} = \langle \Sigma, X \rangle$ is an adequate closed
framework
$\mathcal{F}_\delta = \langle \Sigma \cup \delta, X \cup D_\delta \rangle$ such
that:

a) $D_\delta$ is a set of $(\Sigma \cup \delta)$-axioms, axiomatising a set of
   new (i.e., not in $\Sigma$) symbols $\delta$;

b) the $\Sigma$-reduct $I|\Sigma$ of the intended model $I$ of
   $\mathcal{F}_\delta$ is the intended model of $\mathcal{F}$.

The notions of reduct and expansion are standard in logic [4]. The
$\Sigma$-reduct $I' = I|\Sigma$ forgets the interpretation of the symbols not in
$\Sigma$, in our case the new symbols $\delta$. Conversely, $I$ is said to be a
$(\Sigma \cup \delta)$-expansion of $I'$, that is, a
$(\Sigma \cup \delta)$-expansion is a $(\Sigma \cup \delta)$-interpretation that
preserves the meaning of the old $\Sigma$-symbols, and interprets the new
$\delta$ arbitrarily.

In Definition 4, by b), the intended model $I$ of an adequate extension is an
expansion of the old intended model, that is, adequacy entails that the meaning
of the old symbols is preserved.

If the axioms $D_\delta$ of an adequate extension are explicit definitions, we
say that they are adequate explicit definitions. Since they are important in our
approach, we briefly recall them.

An explicit definition of a new relation $r$ has the form
$\forall \underline{x} \,.\, r(\underline{x}) \leftrightarrow R(\underline{x})$,
where $\underline{x}$ indicates a tuple of variables and the dot (as usual)
extends the scope of a quantifier to the longest subformula next to it. The
explicit definition of a new function $f$ has the form
$\forall \underline{x} \,.\, F(\underline{x}, f(\underline{x}))$, where
$R(\underline{x})$ and $F(\underline{x}, y)$ are formulas of the framework that
contain free only the indicated variables. The explicit definition of $f$ has
the proof obligation
$X \vdash \forall \underline{x} \,.\, \exists! y \,.\, F(\underline{x}, y)$,
where $X$ are the framework axioms (as usual, $\exists! y$ means unique
existence). $R(\underline{x})$ is called the definiens (or defining formula) of
$r$, and $F(\underline{x}, y)$ the definiens (or defining formula) of $f$.

Explicit definitions have nice properties. They are purely declarative, in the 
following sense: they define the new symbols purely in terms of the old ones, 
that is, in a non-recursive way. This declarative character is reflected by the 
following eliminability property, where $\Sigma$ is the signature of the
framework and $\delta$ are the new explicitly defined symbols: the extension is
conservative (i.e., no new $\Sigma$-theorem is added) and every formula of
$\Sigma + \delta$ is provably equivalent to a corresponding formula of the old
signature $\Sigma$. Moreover, if we start from a
sufficiently expressive kernel, most of the relevant relations and functions can be 
explicitly defined. Finally, we can prove: 

Proposition 1. If the definiens of an explicit definition is quantifier-free,
then the definition is adequate.

If the definiens is not quantifier-free, adequacy must be checked. To state the 
adequacy of closed frameworks and of explicit definitions, we can apply proof 
methods based on logic program synthesis [26,27] or constructive logic [38]. 

Example 3. The kernel NAT of Example 1 is sufficiently expressive in the
following sense. Every recursively enumerable relation r can be introduced by an




Specifying Compositional Units for Correct Program Development 7 

explicit definition.² For example, we can define the ordering relations $\le$
and $<$ by the explicit definitions:

$D_\le$ : $i \le j \leftrightarrow \exists k \,.\, i + k = j$;

$D_<$ : $i < j \leftrightarrow i \le j \wedge \neg\, i = j$.

Since the outermost universal quantifiers are implicitly present, $D_\le$ is the
closed formula
$\forall i, j \,.\, i \le j \leftrightarrow \exists k \,.\, i + k = j$
(similarly, $D_<$ is understood to be universally closed).

Since the definiens $\exists k \,.\, i + k = j$ of $D_\le$ is quantified,
adequacy of $D_\le$ must be checked. It can be proved by logic program
synthesis, as follows.

(a) We derive the following clauses in NAT + $D_\le$:

    $P_\le$ : $0 \le i \leftarrow$
              $s(i) \le s(j) \leftarrow i \le j$.

(b) In NAT + $D_\le$ we prove the only-if part of the completed definition [35]
    of $\le$ in $P_\le$ (the if part is guaranteed by (a)).

(c) Finally, we prove that $P_\le$ existentially terminates, i.e., for every
    ground atom $A$, the goal $\leftarrow A$ finitely fails or has at least one
    successful derivation (with program $P_\le$).

By (a), (b) and (c) we get ([27], Theorem 11) that the extension by $D_\le$ is
adequate. By the way, adequacy entails that the new predicate $\le$ is
computable. We do not have to check the adequacy of $D_<$, because its definiens
is quantifier free. $D_<$ uses $\le$. However, an explicit definition of $<$ and
a proof of its adequacy can be given directly in NAT, by the eliminability of
explicit definitions. Thus we could define $<$ first, prove its adequacy, and
then define $\le$ on top of $<$. That is, the order of explicit definitions is
not relevant.
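
The relationship between the explicit definition of the ordering and the two derived clauses can be made concrete with a small sketch. It assumes Python integers as stand-ins for ground numerals; the function names are hypothetical and the code only illustrates the operational reading, it is not the synthesis method of [27].

```python
def leq_program(i, j):
    """Operational reading of the two clauses:
         0 <= i  <-
         s(i) <= s(j)  <-  i <= j
    """
    while True:
        if i == 0:
            return True       # the fact applies: success
        if j == 0:
            return False      # no clause head matches: finite failure
        i, j = i - 1, j - 1   # recursive clause: strip one s from each side


def leq_definition(i, j):
    """Explicit definition: i <= j  <->  exists k . i + k = j.
    On natural numbers any witness k is at most j, so the search is bounded."""
    return any(i + k == j for k in range(j + 1))


def lt_definition(i, j):
    """Explicit definition: i < j  <->  i <= j  and  not i = j."""
    return leq_definition(i, j) and not i == j
```

On every ground goal the loop either succeeds or fails finitely, which mirrors the existential termination required in step (c); agreement with the explicit definition on a finite sample is a sanity check only, not a proof of the completed definition.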

We can explicitly define functions, for example the integer square root sqrt:

$D_{sqrt}$ : $sqrt(i) * sqrt(i) \le i \wedge i < s(sqrt(i)) * s(sqrt(i))$.

The proof obligation
$\forall i \,.\, \exists! j \,.\, j * j \le i \wedge i < s(j) * s(j)$ can be
proved in NAT by induction. Adequacy follows from the fact that the definiens
$j * j \le i \wedge i < s(j) * s(j)$ is quantifier free.
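
The proof obligation for sqrt states that exactly one j satisfies the definiens for each i. A brute-force sketch (with hypothetical function names, and Python integers standing in for numerals) makes the unique-existence claim tangible on small inputs:

```python
def sqrt_definiens(i, j):
    """Quantifier-free definiens:  j*j <= i  and  i < s(j)*s(j)."""
    return j * j <= i and i < (j + 1) * (j + 1)


def isqrt_search(i):
    """Return the witness j for i, checking that it is the only one.
    Any witness satisfies j <= i, so the search space is finite."""
    witnesses = [j for j in range(i + 1) if sqrt_definiens(i, j)]
    if len(witnesses) != 1:
        raise ValueError("unique existence violated")
    return witnesses[0]
```

Because the definiens is quantifier free, checking a candidate j needs only evaluation of * and the ordering, which is the intuition behind Proposition 1.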

An open framework
$\mathcal{F}(\Pi \Rightarrow \Delta) = \langle \Sigma, X \rangle$ represents an
incomplete axiomatisation. It has a non-empty import list $\Pi$, containing the
symbols left open by the axioms, and a (possibly empty) disjoint export list
$\Delta$, containing the symbols that are defined by the axioms, in terms of the
open ones. The closed symbols are the symbols of the signature that are not in
$\Pi \cup \Delta$, and their meaning is fixed in a unique way by the axioms. We
distinguish three sets of axioms, where $\Sigma_K$ is the sub-signature of the
closed symbols:



² Every recursively enumerable relation is Diophantine (Matijasevič's theorem [37]).
- the kernel axioms $X_K = X|\Sigma_K$ ($\ldots|\Sigma_K$ is the subset of the
  axioms with symbols from $\Sigma_K$); the kernel axioms axiomatise the closed
  symbols, that is, $\mathcal{F}_K = \langle \Sigma_K, X_K \rangle$ must be an
  adequate closed framework, that we call the closed kernel;

- the constraints $X_C = (X|(\Sigma_K \cup \Pi)) \setminus X_K$, which constrain
  the possible interpretations of the open symbols $\Pi$;

- and the definition axioms $X_D = X \setminus (X_K \cup X_C)$, which fix the
  meaning of the defined symbols $\Delta$, in terms of the open and closed
  symbols.



Example 4. The following open framework axiomatises lists with generic elements
X and a generic total ordering $\triangleleft$ on X. From now on, in the
examples, the variables of sort X will begin with x, y, z, w, those of sort Nat
with i, j, h, k, and those of sort ListX with l, m, n, o.

Framework LIST($X, \triangleleft \Rightarrow ListX, nil, ., @, nocc$);

KERNEL:  NAT;

DECLS:   $X$ : sort;
         $ListX$ : sort;
         $\triangleleft : [X, X]$;
         $nil : [\,] \to ListX$;
         $. : [X, ListX] \to ListX$;
         $@ : [X, Nat, ListX]$;
         $nocc : [X, ListX] \to Nat$;

DEFAXS:  $ListX$ : construct($nil, . : ListX$);
         $@$ : $x@(0, l) \leftrightarrow \exists y, m \,.\, l = y.m \wedge x = y$;
               $x@(s(i), l) \leftrightarrow \exists y, m \,.\, l = y.m \wedge x@(i, m)$;
         $nocc$ : $nocc(x, nil) = 0$;
                  $x = y \to nocc(x, y.l) = nocc(x, l) + 1$;
                  $\neg\, x = y \to nocc(x, y.l) = nocc(x, l)$;

CONSTRS: $\triangleleft$ : TotalOrdering($\triangleleft$).

The signature $\Sigma_{Nat}$, the axioms $X_{Nat}$ and the theorems $T_{Nat}$ of
the imported kernel NAT are automatically included. In the definition axioms
DEFAXS, nil and . are the list constructors, as indicated by
construct(nil, . : ListX), which contains Clark’s equality theory and structural
induction on constructors; $x@(i, l)$ means that the element x occurs at
position i in the list l, where positions start from 0; $nocc(x, l)$ is the
number of occurrences of the element x in the list l; by the constraint axioms
CONSTRS, $\triangleleft$ is a total ordering relation.

To specify the basic operations on the ADT of lists, the closed kernel NAT 
is not necessary. We have imported it for specification and reasoning purposes. 
Indeed, by using natural numbers we can introduce @ and nocc. The resulting 
language and axiomatic system give a rich starting framework, which allows us 
to explicitly define the usual operations on lists and ordered lists, and to reason 
about them (see Example 6). 
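
To make the definition axioms for @ and nocc concrete, the following sketch reads them as recursive functions over Python lists, with the head and tail of a non-empty list playing the role of y and m in l = y.m. The function names `at` and `nocc` are illustrative, and natural-number positions are assumed to be non-negative Python integers.

```python
def at(x, i, l):
    """x@(i, l): the element x occurs at position i of l (positions from 0).
         x@(0, l)    <->  exists y, m . l = y.m  and  x = y
         x@(s(i), l) <->  exists y, m . l = y.m  and  x@(i, m)
    """
    if not l:
        return False          # nil has no decomposition y.m
    y, m = l[0], l[1:]
    if i == 0:
        return x == y
    return at(x, i - 1, m)


def nocc(x, l):
    """Number of occurrences of x in l.
         nocc(x, nil) = 0
         x = y      ->  nocc(x, y.l) = nocc(x, l) + 1
         not x = y  ->  nocc(x, y.l) = nocc(x, l)
    """
    if not l:
        return 0
    y, rest = l[0], l[1:]
    return nocc(x, rest) + (1 if x == y else 0)
```

Neither function mentions the ordering on X, which reflects the fact that the definition axioms fix the defined symbols uniformly for every admissible interpretation of the open ones.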
An open framework $\mathcal{F}$ has a class of not necessarily isomorphic
intended models, since $\Pi$ allows many $(\Sigma_K \cup \Pi)$-interpretations,
that we call pre-models. The semantics considered here is a variant of the one
presented in [30]. A pre-model is an expansion of the intended model of the
kernel that satisfies the constraints $X_C$. For every pre-model $P$, the axioms
of $\mathcal{F}$ fix a corresponding intended $P$-model $I_P$, defined as
follows.

A $P$-model of $\mathcal{F}$ is a $\Sigma$-model $M$ of $X$ such that
$M|(\Sigma_K \cup \Pi) = P$, that is, $M$ coincides with $P$ over the closed and
open symbols. Since $\Pi$ may contain open sorts, we consider $\Pi$-reachable
models, where $\Pi$-reachability is reachability in an expansion containing a
new constant for each element of each open sort.

Definition 5 ($P$-isoinitial Models). A $P$-model $I$ is a $P$-isoinitial model
of $\mathcal{F}$ iff, for every $P$-model $M$, there is a unique isomorphic
embedding $i : I \to M$ such that $i$ is the identity over the open sorts.

Definition 6 (Adequate Open Frameworks and Intended Models). An open framework
$\mathcal{F}$ is adequate iff, for every pre-model $P$ of $\mathcal{F}$, there is
a $\Pi$-reachable $P$-isoinitial model $I_P$, that we call the intended
$P$-model of $\mathcal{F}$.

$M$ is an intended model of $\mathcal{F}$ iff there is a pre-model $P$ of
$\mathcal{F}$ such that $M$ is the intended $P$-model of $\mathcal{F}$.

For every pre-model $P$, the intended $P$-model is unique up to isomorphism.
Intended models with non-isomorphic pre-models are, of course, non-isomorphic.
We consider closed frameworks as a limiting case, where the kernel coincides with 
the whole framework and the unique intended model coincides with the unique 
pre-model. 

Example 5. LIST is an adequate open framework. In it, a pre-model $P$ coincides
with $N$ for the kernel signature $\Sigma_{Nat}$ and interprets X as any set
with a total ordering $\triangleleft$. The intended $P$-model of LIST interprets
ListX as the set of the finite lists with elements from X, and the other defined
symbols in the way already explained in Example 4.

Adequate open frameworks can be built incrementally, by adequate extensions,
where the intended models of an adequate extension $\mathcal{F}'$ of a framework
$\mathcal{F}$ are expansions of intended models of $\mathcal{F}$.

Definition 7 (Adequate Extensions). A framework $\mathcal{F}'$ is an adequate
extension of an adequate open framework $\mathcal{F}$ iff $\mathcal{F}'$ is an
adequate open or closed framework, the signature and the axioms of
$\mathcal{F}'$ contain those of $\mathcal{F}$, the kernel signature of
$\mathcal{F}'$ contains the kernel signature of $\mathcal{F}$, and for every
intended model $I'$ of $\mathcal{F}'$, the reduct $I'|\Sigma$ is an intended
model of $\mathcal{F}$.

In the limiting case, an adequate extension $\mathcal{F}'$ of an open framework
$\mathcal{F}$ may be a closed framework. In this case, we say that
$\mathcal{F}'$ is an instance of $\mathcal{F}$, and
the axioms that “instantiate” (i.e., close) the open symbols are called closure 
axioms. A set of closure axioms is called a closure. Closures will be considered 
in Section 3.1, together with other framework operations. 
In general, the adequacy of an extension is not decidable, but we may have
different kinds of extensions, with different adequacy conditions. In particular,
we distinguish:

- Parameter extensions. In this case new parameters and/or new constraints
  are added. Parameter extensions are adequate iff, adding new constraints,
  consistency is preserved.

- Defined symbol extensions. In this case new defined symbols, together with
  the corresponding definition axioms, are added. Adequate explicit definitions
  are still useful for introducing new defined symbols, and adequacy can be
  stated in a way similar to those mentioned before for closed framework
  extensions (by program synthesis or constructive logic [27,38]). Proposition 1
  still holds.

- Kernel extensions. In this case the closed kernel is extended by new closed
  symbols, as already shown for closed frameworks.

Example 6. The framework
LIST($X, \triangleleft \Rightarrow ListX, nil, ., @, nocc$) can be obtained by
extending the framework LIST($X \Rightarrow ListX, nil, ., @, nocc$), without
$\triangleleft$ and without constraint axioms, by the parameter
$\triangleleft : [X, X]$ constrained by TotalOrdering($\triangleleft$). The
kernel NAT can be extended by explicitly defining the most useful operations and
predicates on natural numbers. The defined symbols can be extended by the
relevant operations on lists, by means of explicit definitions. For example, the
definitions of list membership, length, concatenation and permutation are:³

$D_\in$ : $x \in l \leftrightarrow nocc(x, l) > 0$

$D_{len}$ : $\forall i \,.\, (\exists x \,.\, x@(i, l)) \leftrightarrow i < len(l)$

$D_|$ : $\forall i, x \,.\, (i < len(l) \to (x@(i, l) \leftrightarrow x@(i, l|m))) \wedge (len(l) \le i \to (x@(i, m) \leftrightarrow x@(i + len(l), l|m)))$

$D_{perm}$ : $perm(l, m) \leftrightarrow \forall x \,.\, nocc(x, l) = nocc(x, m)$

$D_\in$ gives rise to an adequate extension, because its definiens is quantifier
free. The definiens of $D_{len}$ is
$\forall i \,.\, (\exists x \,.\, x@(i, l)) \leftrightarrow i < k$ and the proof
obligation requires a proof of
$\forall l \,.\, \exists! k \,.\, \forall i \,.\, (\exists x \,.\, x@(i, l)) \leftrightarrow i < k$.
Since the definiens is quantified, adequacy must be checked (and can be proved),
by constructive proofs or by program synthesis. Adequacy must be checked (and
can be proved) also for $D_|$ and $D_{perm}$.

Using $\triangleleft : [X, X]$, we can also define operations on ordered lists,
like $l \triangleleft_L m$ (lexicographic ordering on lists), ord(l) (l is an
ordered list), and so on. Their properties can be proved using the total
ordering constraints. For example, we can prove that the lexicographic ordering
$\triangleleft_L$ is, in turn, a total ordering.
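
The explicit definitions of membership and permutation reduce both notions to counting occurrences. A small sketch, assuming list elements are hashable Python values and using hypothetical function names, shows how these definitions can be evaluated directly:

```python
def nocc(x, l):
    """Number of occurrences of x in l."""
    return sum(1 for y in l if y == x)


def member(x, l):
    """Membership:  x in l  <->  nocc(x, l) > 0."""
    return nocc(x, l) > 0


def perm(l, m):
    """Permutation:  perm(l, m)  <->  forall x . nocc(x, l) = nocc(x, m).
    Elements occurring in neither list have count 0 on both sides, so it is
    enough to let x range over the elements of l and m."""
    return all(nocc(x, l) == nocc(x, m) for x in set(l) | set(m))
```

The universal quantifier in the permutation definiens ranges over the whole sort X; the sketch can bound it only because of the observation in the comment, which is the kind of argument an adequacy proof has to supply.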

2.2 Specifications 

In a compositional unit K, specifications assume their proper meaning only in
the context of the framework $\mathcal{F}$. In this section we define formally
what we mean by specifications in $\mathcal{F}$ and we show some examples. We
maintain a strict distinction between specification frameworks and (program)
specifications and, to distinguish the function and relation symbols of the
framework from those computed by programs, the latter will be called (program)
operations.

³ In $D_{len}$ and $D_|$, the universal quantifiers of the definiens have not
been omitted.

Definition 8 (Specifications and $S_\omega$-expansions). Let
$\mathcal{F}(\Pi \Rightarrow \Delta) = \langle \Sigma, X \rangle$ be a
framework. A specification $S_\omega$ in (the context of) $\mathcal{F}$ is a set
of closed $(\Sigma + \omega)$-formulas, that define a set of operations $\omega$
in terms of $\mathcal{F}$.

An $S_\omega$-expansion of a model $M$ of $\mathcal{F}$ is a
$(\Sigma + \omega)$-expansion $M'$ of $M$ such that $M' \models S_\omega$.

That is, $S_\omega$ can be interpreted as an expansion operator that associates
with every intended model of $\mathcal{F}$ the corresponding
$S_\omega$-expansions, namely the expansions that interpret the specified
operations according to $S_\omega$.

Definition 9 (Strict Specifications). A specification $S_\omega$ is strict in a
framework $\mathcal{F}$, if, for every model $M$ of $\mathcal{F}$, there is only
one $S_\omega$-expansion. It is non-strict otherwise.

Now we list different kinds of strict and non-strict specifications considered
in [28], essentially based on explicit definitions. The specification formalism
considered here is tailored to logic programs with definite clauses in a
many-sorted signature. Program semantics is based on minimum Herbrand models,
where program data (those used in programs) coincide with ground terms. We
assume that the signature $\Sigma_D$ of program data is pre-defined by the
framework $\mathcal{F}$, and that, for every closed or defined sort $s$ of
$\Sigma_D$, $\mathcal{F}$ contains the axioms
construct($c_1, \ldots, c_n : s$), where $c_1, \ldots, c_n$ are the constructors
of $s$. They are the unique operations of sort $s$ that can be used in logic
programs. This assumption concerns Herbrand models of standard logic programs,
where construct(...) holds, but our treatment readily extends to the
specification formalism for constraint logic programs by assuming that
$\Sigma_D$ is the constraint signature, and is pre-defined by the framework.

Since in logic programs only program predicates are not pre-defined, we have 
to specify only them. There are different forms of specifications. 



If-and-Only-if Specifications. An if-and-only-if specification in a framework
$\mathcal{F}$ is an explicit definition of a new predicate $r$:

$S_r$ : $\forall \underline{x} \,.\, r(\underline{x}) \leftrightarrow R(\underline{x})$

By the well known properties of explicit definitions, for every model $M$ of the
framework $\mathcal{F}$, there is only one $S_r$-expansion of $M$, that is,
$S_r$ is strict.

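Example 7. In NAT we can specify, for example, the following predicates:

$S_{div}$ : $div(i, j, h, k) \leftrightarrow i = j * h + k \wedge k < j$;

$S_{divides}$ : $divides(i, j) \leftrightarrow \exists h \,.\, div(j, i, h, 0)$;

$S_{prime}$ : $prime(i) \leftrightarrow \forall j \,.\, divides(j, i) \to j = 1 \vee j = i$.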
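
Since if-and-only-if specifications are explicit definitions, each one determines its predicate completely, and on ground arguments it can be evaluated once the quantifiers are bounded. The sketch below does this for the three specifications, under the assumed reading prime(i) <-> forall j . divides(j, i) -> j = 1 or j = i of the last one; the search bounds and function names are illustrative choices.

```python
def div(i, j, h, k):
    """div(i, j, h, k)  <->  i = j*h + k  and  k < j."""
    return i == j * h + k and k < j


def divides(i, j):
    """divides(i, j)  <->  exists h . div(j, i, h, 0).
    A witness h satisfies i*h = j with i > 0, hence h <= j."""
    return any(div(j, i, h, 0) for h in range(j + 1))


def prime(i):
    """Assumed reading:
         prime(i)  <->  forall j . divides(j, i) -> j = 1 or j = i.
    For i > 0 every divisor is at most i; for i = 0 every j > 0 divides i,
    so testing j up to 2 already refutes the definiens."""
    bound = max(i, 2)
    return all((not divides(j, i)) or j == 1 or j == i
               for j in range(bound + 1))
```

Evaluating the definiens literally also shows what the specification does and does not say: under the reading assumed here, 1 satisfies it while 0 does not, so any further restriction would have to be stated explicitly in the specification.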
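Super-and-sub Specifications. A super-and-sub specification in a framework
$\mathcal{F}$ is of the form

$\forall \underline{x} \,.\, (R_{sub}(\underline{x}) \to r(\underline{x})) \wedge (r(\underline{x}) \to R_{super}(\underline{x}))$

where $R_{sub}(\underline{x})$ and $R_{super}(\underline{x})$ are two formulas
of $\mathcal{F}$ such that
$\mathcal{F} \vdash \forall \underline{x} \,.\, R_{sub}(\underline{x}) \to R_{super}(\underline{x})$.

The implication
$\forall \underline{x} \,.\, R_{sub}(\underline{x}) \to R_{super}(\underline{x})$
is satisfied by the models of $\mathcal{F}$. Therefore, in every intended model
$I$, the relation $R_{sub}$ in $I$, i.e., the set of values $\underline{x}$ such
that $I \models R_{sub}(\underline{x})$, is a sub-relation of the relation
$R_{super}$, and the specified relation $r$ is any relation that is a
super-relation of $R_{sub}$ but is a sub-relation of $R_{super}$.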

Conditional Specifications. A conditional specification of a new relation $r$ in
a framework $\mathcal{F}$ has the form:

$$\forall x, y.\ IC(x) \rightarrow (r(x, y) \leftrightarrow R(x, y)) \qquad (1)$$

where $IC(x)$ is the input condition, and $R(x, y)$ is the input-output condition.
Both $IC(x)$ and $R(x, y)$ are formulas of $\mathcal{F}$. (1) specifies $r(x, y)$ only when the
input condition $IC(x)$ is true, while nothing is required if the input condition
is false. That is, $IC(x)$ states that $r(x, y)$ is to be called only in contexts that
make it true. This fact allows us to assume $IC(x)$ when reasoning about correct
reuse, as shown in Section 3.2.

(1) is equivalent to the following super-and-sub specification, which allows
us to apply the results of [34] in correctness proofs:

$$\forall x, y.\ (IC(x) \wedge R(x, y) \rightarrow r(x, y)) \wedge (r(x, y) \rightarrow \neg IC(x) \vee R(x, y))$$
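The equivalence between (1) and its super-and-sub form is purely propositional, so it can be checked by enumerating truth values. The following is a minimal sketch of such a check; the function names are illustrative and do not come from the paper.

```python
from itertools import product

def conditional(ic, r, big_r):
    # Form (1): IC -> (r <-> R)
    return (not ic) or (r == big_r)

def super_and_sub(ic, r, big_r):
    # (IC and R -> r) and (r -> not IC or R)
    sub_part = (not (ic and big_r)) or r
    super_part = (not r) or (not ic) or big_r
    return sub_part and super_part

def equivalent():
    # Compare the two forms on all 8 truth assignments of IC, r, R.
    return all(conditional(*v) == super_and_sub(*v)
               for v in product([False, True], repeat=3))
```

Here the sub-relation is $IC \wedge R$ and the super-relation is $\neg IC \vee R$, so when the input condition fails any interpretation of $r$ is admitted.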

Example 8. In the open framework $\mathcal{LIST}(X, \lhd \Rightarrow ListX, nil, ., @, nocc)$, we
have for example the following specifications:

$S_{sort} : sort(l, m) \leftrightarrow perm(l, m) \wedge ord(m)$;

$S_{merge} : ord(l) \wedge ord(m) \rightarrow (merge(l, m, o) \leftrightarrow ord(o) \wedge perm(l|m, o))$;

$S_{split} : (len(l) > 1 \wedge split(l, m, n) \rightarrow perm(l, m|n) \wedge len(m) < len(l) \wedge len(n) < len(l)) \wedge (len(l) > 1 \rightarrow \exists m, n.\ split(l, m, n))$;

$S_{sort}$ is an if-and-only-if specification, $S_{merge}$ is a conditional specification. By
the input condition, $merge(l, m, o)$ is to be called only in contexts where the input
lists are ordered. If they are not, $o$ is not required to be ordered. $S_{split}$ is an
example of another form of non-strict specification (called a selector specification)
that we do not discuss here (see [28]).

2.3 Interface Specifications, Programs, and Correctness 

Here we consider correctness of (logic) programs with respect to interface spec- 
ifications. In Section 3.2, we will consider the role of interface specifications in 
correct reuse. We start by introducing some terminology.

The signature $\Sigma_P$ of a program $P$ contains the declarations of its predicate,
constant and function symbols, and the sorts occurring in such declarations.
The data signature of $P$ is the subsignature of its sort, constant and function
symbols. According to the previous section, the data signature belongs to the
framework signature. We will distinguish open and closed programs, as follows.

The defined predicates of a program P are those that occur in the head of at 
least one clause of P, while the (possible) open predicates of P are those that 
occur only in the body. A program P is open if its signature contains at least one 
open sort or predicate. It is closed if no open symbol belongs to its signature. 

An interface specification for an open program $P$ is of the form $S_\pi \Rightarrow S_\delta$,
where $S_\pi$ are specifications of a set $\pi$ of predicates that includes all the open
predicates of $P$, and $S_\delta$ are specifications of a set $\delta$ of predicates that are included
in the defined predicates of $P$. We will write $P : S_\pi \Rightarrow S_\delta$ to indicate that $P$ has
specification $S_\pi \Rightarrow S_\delta$. If $P$ has no open predicates, then $S_\pi$ will be empty. In
this case, we write $P : \Rightarrow S_\delta$.

Example 9. In a compositional unit with open framework $\mathcal{LIST}$, we can declare
the following open sorting program (where $S_{split}$, $S_{merge}$ and $S_{sort}$ are as shown
in Example 8):

Program $P_{sort} : S_{split}, S_{merge} \Rightarrow S_{sort}$
{
    sort(nil, nil) ←
    sort(x.nil, x.nil) ←
    sort(x.y.l, o) ← split(x.y.l, m, n),
                     sort(m, m1), sort(n, n1),
                     merge(m1, n1, o).
}
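To make the role of the open predicates concrete, the three clauses can be read functionally, with split and merge passed in as parameters. The sketch below is only an illustration in Python, not the authors' logic program: `sort_open`, `halve` and `merge_ordered` are hypothetical names, and the relational reading (several admissible splits) is collapsed to one deterministic choice.

```python
def sort_open(lst, split, merge):
    """Functional reading of the open sorting program.

    `split` and `merge` play the role of the open predicates: any pair
    that meets S_split and S_merge may be plugged in.
    """
    if len(lst) <= 1:              # sort(nil, nil) and sort(x.nil, x.nil)
        return list(lst)
    m, n = split(lst)              # split(x.y.l, m, n)
    m1 = sort_open(m, split, merge)
    n1 = sort_open(n, split, merge)
    return merge(m1, n1)           # merge(m1, n1, o)

def halve(lst):
    # One admissible split: both parts are strictly shorter when len(lst) > 1.
    mid = len(lst) // 2
    return lst[:mid], lst[mid:]

def merge_ordered(a, b):
    # Merges two ordered lists into an ordered permutation of a|b.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]
```

With `halve` and `merge_ordered` the open program behaves as merge sort; other choices that satisfy the same specifications give other sorting algorithms.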

Programs may be open independently from the framework, i.e., closed frameworks
may contain open programs. For example, in the closed framework $\mathcal{NAT}$,
we can declare:

Program $P_{prod} : S_{sum} \Rightarrow S_{prod}$
{
    prod(i, 0, 0) ←
    prod(i, s(j), h) ← prod(i, j, k), sum(k, i, h).
}

where:

$S_{sum} : sum(x, y, z) \leftrightarrow z = x + y$;

$S_{prod} : prod(x, y, z) \leftrightarrow z = x \cdot y$.

Now we can define program correctness. We will first explain the correctness 
of closed programs in closed frameworks, because it is simpler and more intuitive. 
Then we introduce correctness of open programs. 







Kung-Kiu Lau and Mario Ornaghi 



Correctness of Closed Programs. A closed program $P$ has only defined
predicates, and an interface specification of $P$ is of the form $\Rightarrow S_\delta$. For
simplicity, we will consider the case of interface specifications $\Rightarrow S_r$ with one
defined predicate $r$ (the extension to $S_{r_1}, \ldots, S_{r_k}$, with $k > 1$, is immediate). We
define program correctness in a closed framework as follows:

Definition 10 (Correctness of Closed Programs). Let $\mathcal{F}$ be a closed framework
with intended model $i$. Let $S_r$ be a specification of a predicate $r$, $P$ be a
program that computes $r$, and $H$ be the minimum Herbrand model of $P$. $P$ is correct
with respect to the interface specification $\Rightarrow S_r$ iff the interpretation of $r$ in
$H$ coincides with the interpretation of $r$ in one of the $S_r$-expansions of $i$.

For conciseness, we will say that $P : \Rightarrow S_r$ is correct, to indicate that $P$ is
correct with respect to $\Rightarrow S_r$.

For a strict specification $S_r$, there is only one $S_r$-expansion of $i$, that is, the
new symbol $r$ defined by $S_r$ has a unique interpretation in $i$, and one in $H$.
Correctness of $P : \Rightarrow S_r$ means that the two interpretations of $r$ coincide, or, at
least, are isomorphic. This is illustrated in Figure 2.




Fig. 2. Strict specifications. 



If $S_r$ is not strict, then $r$ has many interpretations with respect to $i$.
Correctness of $P : \Rightarrow S_r$ in this case means that the interpretation of $r$ in $H$
coincides with one of the interpretations of $r$ with respect to $i$. This is
illustrated in Figure 3.




Fig. 3. Non-strict specifications.

Steadfastness: Correctness of Open Programs. Now we consider open programs
in open frameworks, and we discuss the associated notion of correctness.
An open program is correct if it behaves as expected in all circumstances.
We called this property steadfastness [34].

The correctness relation between an interface specification $S_\pi \Rightarrow S_\delta$ and an
open program $P$ cannot be defined as in Definition 10, because open frameworks
may have many intended models and we cannot use minimum Herbrand models
as the semantics of open programs, since in the minimum Herbrand models,
open relations are assumed to be empty, and therefore cannot play the role
of parameters. So in [34] we introduced minimum $j$-models, together with the
notion of steadfastness, to serve as the basis for a model and proof theory of the
correctness of open programs.

Here we first recall the definition of steadfastness informally, and then define
correctness of open programs, and give its relevant properties. A pre-signature
for an open program $P$ is a signature $\Omega$ that contains the data signature and the
open predicates of $P$, but not the defined predicates of $P$. A pre-interpretation
in $\Omega$ is an $\Omega$-interpretation. That is, symbols of $\Omega$ are considered to be open,
and a pre-interpretation $j$ interprets them arbitrarily. In contrast, the intended
meaning of the defined predicates of $P$ is stated by its clauses, in terms of $j$.

Let $P$ be a program with defined predicates $\delta$. To define the intended meaning
of $\delta$ in a pre-interpretation $j$, we introduce $j$-models. A $j$-model of $P$ is a
model $m$ of $P$, such that $m|\Omega = j$, i.e., $m$ coincides with $j$ over $\Omega$. Since two
distinct $j$-models $m$ and $n$ differ only in the interpretation of $\delta$, we can compare
them by looking at $\delta$: we say that $m$ is contained in $n$, written $m \subseteq_\delta n$, iff the
interpretation of (the predicates of) $\delta$ in $m$ is contained in that of $\delta$ in $n$. We
can show that a minimum $j$-model (with respect to $\subseteq_\delta$) exists. The minimum
$j$-model of $P$ will be indicated by $j^P$, and it represents the interpretation of $\delta$
stated by the program $P$, in the pre-interpretation $j$.

Using minimum J-models, steadfastness in an interpretation can be defined 
as follows. 

Definition 11 (Steadfastness). Let $P$ be an open program, $\Omega$ be a pre-signature
for $P$, and $r$ be a predicate defined by $P$. $P$ is steadfast for $r$ in an
$(\Omega + r)$-interpretation $i$ if and only if the interpretation of $r$ in its minimum
$i|\Omega$-model coincides with the interpretation of $r$ in $i$.

More intuitively, steadfastness in $i$ for $r$ means that the interpretation of $r$
in $i$ coincides with the interpretation of $r$ stated by $P$, when the open symbols
$\Omega$ are interpreted as in $i$ (i.e., when the pre-interpretation is $i|\Omega$). Consider for
example the open program $P$ in the context of $\mathcal{NAT}$:

$r(x) \leftarrow p(z, x)$

where $x$ and $z$ are of sort $Nat$. $\Sigma_{Nat} \cup \{p : [Nat, Nat]\}$ is a pre-signature for this
program. Consider the interpretation $i_1$ where $\Sigma_{Nat}$ is interpreted as in $\mathcal{NAT}$,
$r(x)$ means “$x$ is even”, and $p(z, x)$ means “$z + z = x$”. If we interpret $p$ as in $i_1$,
we can easily see that the interpretation of $r(x)$ in the corresponding minimum
model of $P$ coincides with the interpretation of $r(x)$ in $i_1$, i.e., $P$ is steadfast
in $i_1$. Similarly, if we consider $i_2$ that interprets $p(z, x)$ as “$z * z = x$”, to get
steadfastness $i_2$ has to interpret $r(x)$ as “$x$ is a perfect square”.
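The example can be replayed on a finite fragment of the natural numbers. The sketch below is an illustration under stated assumptions: the domain is cut to 0..49, the helper names are hypothetical, and the minimum $j$-model of the single non-recursive clause $r(x) \leftarrow p(z, x)$ is computed directly as the set of $x$ for which some $z$ satisfies $p$.

```python
import math

def minimum_model_r(p, domain):
    """Interpretation of r in the minimum j-model of  r(x) <- p(z, x),
    where the pre-interpretation j fixes p over a finite domain."""
    return {x for x in domain for z in domain if p(z, x)}

def steadfast(p, r, domain):
    # Steadfast iff the computed r equals the r of the given interpretation.
    return minimum_model_r(p, domain) == {x for x in domain if r(x)}

DOMAIN = range(0, 50)                     # assumed finite cut of Nat

def double(z, x):                         # p(z, x) read as z + z = x
    return z + z == x

def square(z, x):                         # p(z, x) read as z * z = x
    return z * z == x

def is_even(x):
    return x % 2 == 0

def is_perfect_square(x):
    return math.isqrt(x) ** 2 == x
```

The cut is harmless here because every witness $z$ is no larger than $x$. Pairing `square` with `is_even` fails the check, which matches the remark that $i_2$ must read $r$ as "perfect square".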

Correctness is steadfastness in the expansions of the intended models stated 
by the interface specification: 

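Definition 12 (Correctness of Open Programs). Let $\mathcal{F}$ be an open framework,
and $P : S_\pi \Rightarrow S_\delta$ be an open program. $P : S_\pi \Rightarrow S_\delta$ is correct in $\mathcal{F}$
iff for every intended model $i$ of $\mathcal{F}$ and every $S_\pi$-expansion $i_\pi$ of $i$, there is an
$S_\delta$-expansion $i_{\pi,\delta}$ of $i_\pi$, such that $P$ is steadfast in $i_{\pi,\delta}$ for the predicate symbols
of $\delta$.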

The intuitive meaning of the previous definition is the following: $\Sigma_d + \pi$ is a
pre-signature for $P$, and $i_\pi$ is a pre-interpretation that interprets the data
signature according to $\mathcal{F}$ and the open symbols according to $S_\pi$, i.e., $i_\pi$ represents a
legal parameter passing. Steadfastness of $P$ in $i_{\pi,\delta}$ means that the interpretation
of $\delta$ stated by $P$ for the parameter passing $i_\pi$ is correct with respect to $S_\delta$.

The following important properties of correct reusability hold (see [34]): 

Proposition 2 (Inheritance). Let $\mathcal{F}'$ be an adequate extension of an adequate
(open) framework $\mathcal{F}$. If $P : S_\pi \Rightarrow S_\delta$ is correct in $\mathcal{F}$, then it is correct in $\mathcal{F}'$.

As we will show in Section 3, framework composition can be treated in 
terms of extension. Therefore inheritance yields a first level of correct reusability, 
namely reusability of correct programs through framework composition, exten- 
sion and instantiation. This level of correct reusability would not be important, 
however, if we could not guarantee the correctness of the composition of the 
inherited open programs. This second, important level of correct reusability will 
be called compositionality . In compositionality, interface specifications play a 
central role, as shown by the following theorems: 

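Theorem 2 (Compositionality). If $P : S_{\pi_1} \Rightarrow S_{\delta_1}$ and $Q : S_{\pi_2} \Rightarrow S_{\delta_2}$
are correct in a framework $\mathcal{F}$ and are not mutually recursive, then
$P \cup Q : S_{\pi_1}, S_{\pi_2} \Rightarrow S_{\delta_1}, S_{\delta_2}$ is correct in $\mathcal{F}$.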

As we can see, interface specifications indicate how programs can be com- 
posed to correctly interact. Theorem 2 can be extended to mutually recursive 
programs, but in this case we have to check that open termination [34] is pre- 
served. By inheritance and compositionality we get reusability, as shown by the 
following example. 

Example 10. We can show that the open program $P : S_{split}, S_{merge} \Rightarrow S_{sort}$ in
Example 9 is correct in the open framework $\mathcal{LIST}$. This means that, in every
instance of $\mathcal{LIST}$, $P : S_{split}, S_{merge} \Rightarrow S_{sort}$ is always correct with respect
to the specification $S_{sort}$, provided that it is composed with closed correct
programs $Q_{merge} : \Rightarrow S_{merge}$ and $Q_{split} : \Rightarrow S_{split}$.

This example shows that compositionality corresponds to a priori correctness 
of open programs in a framework. It thus corresponds to correctness of open 
modules in a library. It is to be contrasted with a posteriori correctness, i.e., 
correctness established by verification after program composition.

2.4 Related Work

Specification frameworks are similar to Abstract Data Types (ADT’s). ADT’s 
became popular in the 80’s and have been widely studied [47]. In general, they 
are based on the initial algebra approach, that is, intended models are initial 
models. Parametric ADT’s have also been studied. These are similar to our 
open frameworks, even though they are technically defined in a different way. 
A detailed treatment of algebraic ADT’s, including the parametric case, can be 
found, for example, in [13]. 

The initial algebra approach is adequate for ADT specification, where the 
purpose is to give the minimal signature and axioms that are needed to char- 
acterise the desired data and operations. Initial models generalise the idea of 
minimum Herbrand models, and always exist for algebraic ADT’s and consis- 
tent Horn theories [21]. The existence of an initial model allows us to axiomatise 
only positive knowledge and to use (consistently) negation as failure: a fact is 
false if we do not have evidence of its truth. This allows for very compact ax- 
iomatisations. 

In contrast, our purpose is “knowledge representation”, that is, we are looking
for an expressive signature and a rich set of axioms, to obtain a framework that
represents our overall knowledge of a problem domain and allows us to reason
about it. Isoinitial semantics requires stronger axiomatisations, and better meets
our “richness requirement”, compared to initial semantics. It was introduced in
[5], with the purpose of giving a model-theoretic characterisation of computable
ADT’s.

Finally, our approach is different from the algebraic approach in the three- 
level architecture of our compositional units, and in the role that frameworks 
play in it. In this regard, we are closer to the two-tiered specification style of 
Larch [20], where specifications have two components: the first one is written 
in the Larch Shared Language LSL, and the second one in a Larch Interface 
Language, which is oriented to the programming language and is used to specify 
the interfaces between program components, i.e., the way they communicate. 

We consider non-recursive definitions, like explicit or conditional definitions, 
to be an important tool for both extending frameworks and specifying programs. 
In this regard, our work is similar to [36]. At the program and specification levels, 
our approach is in the tradition of logic program synthesis and correctness. Our 
notion of correctness for closed programs is similar to the one introduced in 
[22]. Correctness of open programs with respect to specifications similar to our 
interface specifications is considered in [10]. A conditional specification is like a 
pre-post-condition style of specification as in VDM [24], Z [45], and B [1], except 
that it is declarative. Declarative conditional specifications for logic programs 
were introduced in [8]. 

3 Operations on Compositional Units 

In this section we consider compositional units as building blocks for program 
development, that is, we focus on operations on compositional units that allow
their correct reuse in the process of developing an application. This process
starts from pre-existing compositional units, and iteratively extends them either 
directly, by inserting domain specific knowledge, or by reusing, i.e., incorporat- 
ing, other compositional units. Reuse can, in turn, be factorised into composition 
and extension, which are the basic operations considered in this section. Such 
operations involve the framework, the specification and the program levels. We 
will consider the different levels separately. 

3.1 Framework Reuse 

To compose two units $C_1$ and $C_2$, first we compose and/or extend their frameworks
$\mathcal{F}_1$ and $\mathcal{F}_2$ into a common extension $\mathcal{F}$, and then use the specifications
and programs in this richer $\mathcal{F}$. Thus, in general, the framework level is the first
level to be involved in operations on compositional units. Here we consider the 
basic operations needed at this level. 

Framework Morphisms. Operations like renaming or identifying different 
symbols may be needed for framework reuse. This kind of operation is formalised 
by framework morphisms. Before introducing framework morphisms, we briefly 
recall signature and theory morphisms [19,13]. 

A signature morphism $\mu : \Sigma_1 \rightarrow \Sigma_2$ is a map from the symbols of $\Sigma_1$ to those
of $\Sigma_2$ that preserves the declarations. $\Sigma_2$ extends $\Sigma_1$ in the following sense:

— $\Sigma_2$ contains the $\mu$-image of $\Sigma_1$;

— every $\Sigma_1$-formula $F$ translates into a $\Sigma_2$-formula $\mu(F)$;

— instead of $\Sigma_1$-reducts we have $\mu$-reducts: the $\mu$-reduct of a $\Sigma_2$-interpretation
$m$ is the $\Sigma_1$-interpretation $m|\mu$ that interprets every symbol $\sigma$ of $\Sigma_1$ as $m$
interprets the image $\mu(\sigma)$.

A theory morphism $\mu : \langle\Sigma_1, X_1\rangle \rightarrow \langle\Sigma_2, X_2\rangle$ is a signature morphism $\mu : \Sigma_1 \rightarrow \Sigma_2$
such that $\mu(X_1)^* \subseteq X_2^*$, where $*$ denotes the proof-theoretic closure. $\mu$ works
as a generalised extension, in the sense that $\Sigma_2$ contains (the $\mu$-image of) $\Sigma_1$
and the theorems of $X_2$ contain (the $\mu$-translation of) those of $X_1$.
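The $\mu$-reduct has a direct computational reading when interpretations are represented as finite maps from symbols to meanings. The sketch below makes that assumption; `mu_reduct` and the sample symbol tables are hypothetical, and the meanings are plain strings standing in for actual carriers and relations.

```python
def mu_reduct(m, mu):
    """Return the mu-reduct of m.

    m  : dict from Sigma_2 symbols to their meanings
    mu : dict from Sigma_1 symbols to Sigma_2 symbols (the signature morphism)
    Every Sigma_1 symbol sigma is interpreted as m interprets mu(sigma).
    """
    return {sigma: m[mu[sigma]] for sigma in mu}

# A morphism that maps the open sort X to Nat and renames ListX to ListNat.
mu = {"X": "Nat", "ListX": "ListNat", "nil": "nil"}

# A Sigma_2-interpretation (meanings abbreviated as strings).
m = {
    "Nat": "natural numbers",
    "ListNat": "finite lists of naturals",
    "nil": "empty list",
    "leq": "usual order",
}
```

Symbols of $\Sigma_2$ outside the image of $\mu$, such as `leq` here, are simply forgotten by the reduct, as they are by an ordinary reduct along an inclusion.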

Framework morphisms are defined as follows: 

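Definition 13 (Framework Morphism). Let $\mathcal{F}(\Pi \Rightarrow \Delta) = \langle\Sigma, X\rangle$ and
$\mathcal{F}'(\Pi' \Rightarrow \Delta') = \langle\Sigma', X'\rangle$ be two frameworks. A framework morphism $e : \mathcal{F} \rightarrow \mathcal{F}'$
is a theory morphism $e : \langle\Sigma, X\rangle \rightarrow \langle\Sigma', X'\rangle$ such that the kernel signature of $\mathcal{F}'$
contains the $e$-image of the kernel signature of $\mathcal{F}$.

$\mathcal{F}'$ can be considered as a generalised extension of $\mathcal{F}$. We say that it is the
extension generated by the morphism $e : \mathcal{F} \rightarrow \mathcal{F}'$. Let $\mathcal{F}$ be adequate. We say
that $\mathcal{F}'$ is an adequate extension of $\mathcal{F}$ if $\mathcal{F}'$ is adequate and, for every intended
model $i'$ of $\mathcal{F}'$, the $e$-reduct $i'|e$ is an intended model of $\mathcal{F}$.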

Framework extensions considered in the previous section, that simply introduce
new symbols and axioms, are a particular case. They correspond to
inclusion morphisms $e$ that map each symbol $\sigma$ into $\sigma$ itself.

Example 11. A (generalised) adequate extension of $\mathcal{LIST}$ is:

Framework $\mathcal{LIST}_1(\lhd \Rightarrow ListNat, nil, ., @, nocc)$
    EXTENDS $\mathcal{LIST}$;
    CLOSE: $X$ BY $Nat$;
    RENAME: $ListX$ BY $ListNat$;

It is generated by the morphism $e$ defined by the clauses CLOSE and RENAME. $e$
maps the sort symbol $X$ into $Nat$, the sort symbol $ListX$ into $ListNat$, and leaves
the other symbols unchanged. Of course, arities and sorts in relation, function
and constant declarations are translated by replacing $X$ and $ListX$ by $Nat$ and
$ListNat$. For example, now we have $nil : [\,] \rightarrow ListNat$ and $. : [Nat, ListNat] \rightarrow ListNat$.

In $\mathcal{LIST}_1$, $ListNat$ is closed (its intended meaning is the set of finite lists of
natural numbers) and the only open symbol is $\lhd : [Nat, Nat]$. A closed adequate
extension can be obtained by closing $\lhd$ by

$$D_\lhd : x \lhd y \leftrightarrow x \le y$$

In this case, we have simply added the new axiom $D_\lhd$, that is, we have an
inclusion morphism of $\mathcal{LIST}_1$ into a closed framework, that we will indicate by
$\mathcal{LISTNAT}$.

The morphisms considered in this example are at the basis of the closure 
operations that we consider next. 



Closure. A closure is an extension that closes the meaning of some symbols. 
Here, we consider closure by internalisation, as defined in [30]. As shown in [30], 
internalisation can be used to implement constrained parameter passing, as well 
as to introduce objects as the closures of suitable open frameworks that represent 
classes. 

Let $\mathcal{F}(\Pi \Rightarrow \Delta) = \langle\Sigma, X\rangle$ be an open framework. An internalisation of an
open symbol is one of the following operations:

— Sort closure. The closure: CLOSE $S$ BY $s$
renames the open sort $S$ by a sort $s$ of the signature $\Sigma_k$ of the closed kernel.
No axioms are added.

— Relation closure. The operation: CLOSE $r$ BY $\forall x.\ r(x) \leftrightarrow R(x)$
closes $r$ by the new closure axiom $\forall x.\ r(x) \leftrightarrow R(x)$. The declaration of $r$ may
contain only sorts of $\Sigma_k$, and the defining formula $R(x)$ is a $\Sigma_k$-formula.

— Function closure. The operation: CLOSE $f$ BY $\forall x.\ F(x, f(x))$
closes $f$ by the new closure axiom $\forall x.\ F(x, f(x))$. The declaration of $f$ may
contain only sorts of $\Sigma_k$, and the defining formula $F(x, y)$ is a $\Sigma_k$-formula
such that $X_k \vdash \forall x.\ \exists! y.\ F(x, y)$.

Let $\mathcal{F}(\Pi \Rightarrow \Delta) = \langle\Sigma, X\rangle$ be an open framework. A closure by internalisation
is an internalisation that closes all the open sorts by closed sorts and all the open
relation and function symbols by a set $D_\Pi$ of closure axioms, and satisfies the
following constraint satisfaction condition:

$$X_k \cup D_\Pi \vdash X_c \qquad (2)$$

It produces the framework $\mathcal{F}' = \langle\Sigma', (X \setminus X_c) \cup D_\Pi, T \cup X_c\rangle$ where $\Sigma'$ is obtained
by replacing in $\Sigma$ each open sort $s$ with the sort $s'$ closing $s$, $D_\Pi$ are the closure
axioms of the open functions and relations, and, by (2), the constraints $X_c$ have
been deleted from the axioms and added to the theorems.

Example 12. $\mathcal{LISTNAT}$ of Example 11 is a closure of $\mathcal{LIST}$. It has been
obtained by a sort closure and a relation closure, by the definition $D_\lhd$. Constraints
are satisfied because

$$X_{Nat} \cup D_\lhd \vdash TotalOrdering(\lhd)$$

Now the total ordering axioms $TotalOrdering(\lhd)$ are no longer constraints, but
theorems.

A different closure could be obtained, e.g., by closing $\lhd$ by the reverse ordering
$x \lhd y \leftrightarrow y \le x$.
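The constraint satisfaction condition is a provability statement, but a candidate closure can at least be sanity-checked on a finite fragment before a proof is attempted. The sketch below assumes that $TotalOrdering(\lhd)$ stands for the usual axioms of a reflexive total order (the axioms themselves are not listed in this section), and `is_total_ordering` is a hypothetical helper. Passing the check on a finite range is evidence only, not a proof.

```python
from itertools import product
import operator

def is_total_ordering(rel, domain):
    """Check reflexivity, antisymmetry, transitivity and totality of `rel`
    on a finite domain (assumed reading of TotalOrdering)."""
    elems = list(domain)
    reflexive = all(rel(x, x) for x in elems)
    antisymmetric = all(not (rel(x, y) and rel(y, x)) or x == y
                        for x, y in product(elems, repeat=2))
    transitive = all(not (rel(x, y) and rel(y, z)) or rel(x, z)
                     for x, y, z in product(elems, repeat=3))
    total = all(rel(x, y) or rel(y, x)
                for x, y in product(elems, repeat=2))
    return reflexive and antisymmetric and transitive and total

def reverse_leq(x, y):
    # The alternative closure: x <| y iff y <= x
    return y <= x

def same_parity(x, y):
    # A relation that is not an ordering, used as a negative case.
    return x % 2 == y % 2
```

Both `operator.le` and `reverse_leq` pass on an initial segment of the naturals, while `same_parity` is rejected because it is neither antisymmetric nor total.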

A closure of a framework T should be an adequate closed extension of T . 
We can prove: 

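Theorem 3 (Closure). Let $\mathcal{F}(\Pi \Rightarrow \Delta) = \langle\Sigma, X\rangle$ be an adequate framework,
and $\mathcal{F}' = \langle\Sigma', (X \setminus X_c) \cup D_\Pi, T \cup X_c\rangle$ be the result of a closure. Then $\mathcal{F}'$ is an
adequate closed extension of $\mathcal{F}$ iff $\mathcal{F}'$ is consistent and atomically complete.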

The relation and function closures preserve consistency because $D_\Pi$ are explicit
definitions in the kernel and, by the constraint satisfaction condition, $X_c$
become theorems. Thus consistency is preserved if sort closures preserve the
consistency of $X \setminus X_c$. A sufficient condition to preserve consistency is that no
cardinality restrictions are imposed on the open sorts, as is commonly the case
(like, e.g., the open sort $X$ in generic lists).

Concerning atomic completeness, let $\mathcal{K}$ be the extension of the closed kernel
of $\mathcal{F}$ by the closure axioms $D_\Pi$. Atomic completeness may not be guaranteed
for two reasons: (a) $\mathcal{K}$ is not atomically complete because $D_\Pi$ are not adequate
explicit definitions in the kernel, or (b) the atomic completeness of $\mathcal{K}$ is not
sufficient to obtain the atomic completeness for the defined symbols, because stronger
properties are required by the definition axioms. To avoid (a), $D_\Pi$ must be
adequate explicit definitions in the kernel. With quantifier-free defining formulas,
adequacy is guaranteed by Proposition 1. An example of (b) is the definition
axiom $r(x) \leftrightarrow \exists y.\ p(x, y)$, where $p$ is a parameter; in this case, $\mathcal{K}$ should prove
$\exists y.\ p(x, y)$ or $\neg\exists y.\ p(x, y)$ for every ground $x$, i.e., atomic completeness of $\mathcal{K}$
does not suffice. However, in general it is reasonable to look for definition axioms
that close the defined symbols whenever the open ones become closed, i.e., case
(b) should be the exception. Thus, if we do not use quantified defining formulas,
closures by internalisation are, in general, adequate.

Closure may also be performed incrementally, step by step. A partial closure
is called a specialisation, because it does not close all the open symbols. Besides
partial closure, we may have other kinds of specialisation. For example: adding
constraints, using open symbols in the defining formulas, mapping open sorts into
non-closed sorts, and so on. All these operations can be formalised by extension
morphisms, but we will omit relevant details.

Example 13. The framework $\mathcal{LIST}_1$ of Example 11 is a specialisation of $\mathcal{LIST}$,
obtained by the partial closure of $X$.



Framework Composition. Framework composition essentially coincides with
framework union. The simplest case is disjoint union. However, it may happen
that we want to preserve a common part, for example natural numbers. Here
we consider the composition of two frameworks $\mathcal{F}_1$ and $\mathcal{F}_2$ that have a common
subframework $\mathcal{G}$ containing their closed kernel, and have disjoint signatures for
the symbols not in $\mathcal{G}$. In this case, composition preserving $\mathcal{G}$ can be defined as
the operation $+_\mathcal{G}$ that builds the composite $\mathcal{F}_1 +_\mathcal{G} \mathcal{F}_2$ simply by making the
union of signatures, open and closed symbols, and axioms. If $\mathcal{F}_1$ and $\mathcal{F}_2$ share
symbols not in $\mathcal{G}$, then we rename such symbols, to make them different before
performing the union.

$+_\mathcal{G}$ is syntactic composition. Its semantic counterpart is amalgamation. Two
intended models $i_1$ of $\mathcal{F}_1$ and $i_2$ of $\mathcal{F}_2$ are amalgamable if they coincide over
the common signature. Their amalgamation is the interpretation $i_1 + i_2$ that
coincides with $i_1$ over the signature of $\mathcal{F}_1$ and with $i_2$ over the signature of
$\mathcal{F}_2$ (the definition is consistent, because $i_1$ and $i_2$ coincide over the common
signature). The intended models of $\mathcal{F}_1 +_\mathcal{G} \mathcal{F}_2$ are the amalgamations of the pairs
of amalgamable intended models of $\mathcal{F}_1$ and $\mathcal{F}_2$.
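Amalgamation can be pictured with interpretations represented as finite maps from symbols to meanings. The following sketch makes that simplifying assumption; `amalgamable`, `amalgamate` and the sample interpretations are hypothetical and only illustrate the "coincide over the common signature" condition.

```python
def amalgamable(i1, i2):
    # Two interpretations are amalgamable if they agree on every shared symbol.
    common = i1.keys() & i2.keys()
    return all(i1[s] == i2[s] for s in common)

def amalgamate(i1, i2):
    """Return i1 + i2: it coincides with i1 over the signature of the first
    framework and with i2 over the signature of the second."""
    if not amalgamable(i1, i2):
        raise ValueError("interpretations disagree on the common signature")
    return {**i1, **i2}

# Sample interpretations sharing the kernel symbol "Nat".
i1 = {"Nat": "natural numbers", "ListNat": "finite lists of naturals"}
i2 = {"Nat": "natural numbers", "ListBool": "finite lists of booleans"}
bad = {"Nat": "integers", "ListBool": "finite lists of booleans"}
```

The union in `amalgamate` is well defined precisely because the two maps agree on their shared keys, which mirrors the consistency remark in the text.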

This kind of composition has been formalised in ADT’s using pushouts (see 
e.g., [13]), and the pushout approach also works for frameworks, and it allows 
us to generalise the operation +g. We do not consider the general case here for 
conciseness. 

Example 14. Let $\mathcal{BOOL}$ be a framework defining booleans in the usual way. Lists
of booleans with open ordering $\lhd : [Bool, Bool]$ can be defined starting from the
disjoint union $\mathcal{LIST} + \mathcal{BOOL}$, as follows:

Framework $\mathcal{LIST}_2(\lhd \Rightarrow ListBool, nil, ., @, nocc)$
    EXTENDS $\mathcal{LIST} + \mathcal{BOOL}$;
    CLOSE: $X$ BY $Bool$;
    RENAME: $ListX$ BY $ListBool$;

We can compose lists of booleans $\mathcal{LIST}_2$ and lists of natural numbers $\mathcal{LIST}_1$
(see Example 11). To avoid duplicating the kernel of natural numbers, we perform
the composition with common subframework $\mathcal{NAT}$:

$$\mathcal{LIST}_1 +_{\mathcal{NAT}} \mathcal{LIST}_2$$

To distinguish the non-common symbols, the composition renames them. Since
we allow overloading, only sort and constant renaming may be needed. For
example, we have $nil_1 : [\,] \rightarrow ListNat$, $nil_2 : [\,] \rightarrow ListBool$, overloaded
$. : [Nat, ListNat] \rightarrow ListNat$ and $. : [Bool, ListBool] \rightarrow ListBool$, and so on.



3.2 Specification Reuse 

Specifications are used in two ways. Before composition or extension, they are 
used as a guide to search for possible compositional units that specify a desired 
context and set of operations. For example, if we need list sorting, we look for 
compositional units that contain the framework for lists with totally ordered 
elements, a specification $S_{sort}$ of a sort operation, and a program
$P : \Rightarrow S_{sort}, \ldots$. After composition or extension, the compositional units guide program
composition, according to Theorem 2.

Reusability after composition or extension is enhanced by specification re- 
duction, as considered in [15]. Indeed, after extension or composition, we have a 
richer framework, where new properties have been added. It may happen that a 
specification can be reduced to a new specification, that is, in the new context 
the new specification can replace the old one. 

Informally, an “old” specification S reduces to a “new” specification S' if 
correctness with respect to the new S' entails correctness with respect to the old 
S. Formally, we give the following definition: 

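Definition 14 (Specification Reduction). Let $\mathcal{F}$ be a framework, and $S_\omega$,
$S'_{\omega'}$ be two sets of specifications in $\mathcal{F}$. We say that $S_\omega$ reduces to $S'_{\omega'}$ iff $\omega \subseteq \omega'$
and $\mathcal{F} \vdash S'_{\omega'} \rightarrow S_\omega$.

For two interface specifications $S_{\pi_1} \Rightarrow S_{\delta_1}$, $S_{\pi_2} \Rightarrow S_{\delta_2}$, we say that $S_{\pi_1} \Rightarrow S_{\delta_1}$
reduces to $S_{\pi_2} \Rightarrow S_{\delta_2}$ iff $S_{\pi_2}$ reduces to $S_{\pi_1}$ and $S_{\delta_1}$ reduces to $S_{\delta_2}$.

Reduction is transitive and reflexive. Its meaning is made clear by Theorem 4:

Theorem 4. Let $\mathcal{F}$ be a framework, and $S_{\pi_1} \Rightarrow S_{\delta_1}$ and $S_{\pi_2} \Rightarrow S_{\delta_2}$ be two
interface specifications. If $S_{\pi_1} \Rightarrow S_{\delta_1}$ reduces to $S_{\pi_2} \Rightarrow S_{\delta_2}$ in $\mathcal{F}$, then every
program $P$ that is correct with respect to $S_{\pi_2} \Rightarrow S_{\delta_2}$ is also correct with respect
to $S_{\pi_1} \Rightarrow S_{\delta_1}$ (in $\mathcal{F}$).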



Example 15. Let $K$ be a compositional unit with open framework $\mathcal{LIST}$, and
let $S_{lhd}$ be the strict specification $S_{lhd} : lhd(x, y) \leftrightarrow x \lhd y$. In the extension
$\mathcal{LISTNAT}$ of $\mathcal{LIST}$, $S'_{lhd} : lhd(x, y) \leftrightarrow x \le y$ reduces to $S_{lhd}$ (we prove
$S_{lhd} \rightarrow S'_{lhd}$ by the closure axiom $x \lhd y \leftrightarrow x \le y$). Thus $S_{lhd} \Rightarrow S_{merge}$ (where
$S_{merge}$ is defined in Example 8) reduces to $S'_{lhd} \Rightarrow S_{merge}$, and we can use $S'_{lhd}$
when deriving correct merge programs. For example, we could write a correct
program $P_{merge'} : S'_{lhd} \Rightarrow S_{merge}$ which avoids comparisons with 0, since 0
is the minimum natural number; $P_{merge'}$ would correctly override a (possibly)
inherited $P_{merge} : S_{lhd} \Rightarrow S_{merge}$.

In the reduction of conditional specifications [15], we can take into account
the call context. This is shown in the following example.

Example 16. In the open framework $\mathcal{LIST}$ we can give the following
specifications:

$S'_{merge} : l = x.nil \wedge ord(m) \rightarrow (merge(l, m, o) \leftrightarrow ord(o) \wedge perm(x.m, o))$;

$S'_{split} : split(x.l, m, n) \leftrightarrow m = x.nil \wedge n = l$.

$S_{split}$ of Example 8 reduces to $S'_{split}$ ($S'_{split} \rightarrow S_{split}$ can be proved in $\mathcal{LIST}$).
$S_{merge}$ of Example 8 reduces to $S'_{merge}$ in a call context where the input
condition $l = x.nil \wedge ord(m)$ of $S'_{merge}$ holds for the lists $l$ and $m$ to be merged.
Indeed, in such a context, $S'_{merge}$ corresponds to $merge(x.nil, m, o) \leftrightarrow ord(o) \wedge
perm(x.m, o)$, $S_{merge}$ to $merge(x.nil, m, o) \leftrightarrow ord(o) \wedge perm(x.nil|m, o)$, and
they are equivalent. We will say that $S_{merge}$ contextually reduces to $S'_{merge}$.
Contextual reduction implies contextual reuse, that is, $S_{merge}$ correctly reduces
to $S'_{merge}$ only when the input condition of $S'_{merge}$ is true. As a consequence,
we cannot replace $S_{merge}$ by $S'_{merge}$ in isolation, but we have to consider
the call context. In contrast, we can replace $S_{split}$ by $S'_{split}$ in isolation, because
the corresponding reduction is not contextual.

As we will see in Example 17, $S'_{merge}$ and $S'_{split}$ are tailored to the insertion
sort algorithm. In a similar way, we can specialise $S_{merge}$ and $S_{split}$ to obtain
specifications tailored to different sorting algorithms, like merge sort, quick sort,
and so on.

In general, it is useful to list proven reduction theorems in the specification 
section of a compositional unit. Such a list would allow us to automatically search 
for families of program compositions, giving rise to families of implementations. 
It is for this reason that we have put $RD_1, \ldots, RD_k$ in Fig. 1.



3.3 Program Reuse 

Like specifications, programs in compositional units can be used before and after 
unit composition. 

We use programs before composition when we look for existing compositional 
units containing specific algorithms. Otherwise, reuse is after unit composition, 
when we use the inherited programs to solve the problem in question. The op- 
eration that allows us to reuse the inherited programs is program composition. 
It is strongly guided by specifications. Specification reduction is important for 
program reusability, since it allows us to use the richer knowledge obtained af- 
ter framework composition and extension to solve the puzzle of composing the 
inherited open programs into a correct solution of the problem at hand. 

Example 17. Let $K$ be a compositional unit with framework $\mathcal{LIST}$, and let us
assume that it already contains the correct program $P_{insert} : S_{lhd} \Rightarrow S_{insert}$,
where $P_{insert}$ implements the usual algorithm for inserting an element into its
correct position in an ordered list, $S_{lhd}$ is the specification shown in Example
15, and $S_{insert}$ is:

$S_{insert} : ord(l) \rightarrow (insert(x, l, m) \leftrightarrow ord(m) \wedge perm(x.l, m))$.

We show how reductions of Example 16 can be used to solve the puzzle of
obtaining a correct sorting program $Q_{sort} : S_{lhd} \Rightarrow S_{sort}$. If we compose $P_{insert}$
with the correct one-clause programs

$P_{split} : \Rightarrow S'_{split}$        split(x.l, x.nil, l) ←

$P_{link} : S_{insert} \Rightarrow S'_{merge}$        merge(x.nil, l, o) ← insert(x, l, o)

we get a correct program $Q_{aux} : S_{lhd} \Rightarrow S'_{split}, S'_{merge}$. By the specification
reductions of Example 16, the interface specification $S'_{split}, S'_{merge} \Rightarrow S_{sort}$
contextually reduces to $S_{split}, S_{merge} \Rightarrow S_{sort}$. Thus, the program
$P_{sort} : S_{split}, S_{merge} \Rightarrow S_{sort}$ of Example 9 is also correct with respect to
$S'_{split}, S'_{merge} \Rightarrow S_{sort}$, because the input condition of $S'_{merge}$ is satisfied in the
call context of merge, as required. By composing $P_{sort}$ and $Q_{aux}$, we get a correct
$Q_{sort} : S_{lhd} \Rightarrow S_{sort}$.
Qsort can be closed in the instances that close Ihd. For example, Sihd reduces 
to in a compositional unit with framework CISTMAT , as shown in Example 
15. Suppose that our compositional unit already contains a correct program 
$P_{leq} : S_{leq}$. If we compose it with the correct one-clause program 

Pihd ■- Sieq ^ S'lhd lhd{x, y) ^ leq{x, y) 

we get a closed correct program Qihd Sihd- By specification reduction we get 
that Qihd '- S'lf^^ is also correct. Then the closed program QihdAiQsort ■ Ssort 
is correct in CISTAfAT . 

3.4 Related Work 

At the framework level, our approach to modularity and reuse is in the tradition 
of algebraic ADTs [2,13,47]. We can apply the techniques developed there, based 
on theory morphisms. Our specification frameworks should not be confused with 
the specification frames introduced in [25]. The latter, like institutions [19], are 
general frames for the composition and reuse of formal theories. With respect to 
modularity and compositionality, our frameworks with open symbols and defined 
symbols are similar, for example, to modules with import and export interfaces, 
as introduced in [14]. 

In [25], a distinction between parameterised specifications and parameterised 
data types is introduced, following [42]. In [42], programs and specifications 
are considered as different entities, involved in different phases and different 
methodological aspects of program development, and a distinction between pa- 
rameterised specifications and specifications of parameterised programs is in- 
troduced. In this respect, [42] is very close to our general view, although our 
approach is different. Our three-level architecture of compositional units is closer to Larch 
[20]. Like Larch, our specifications state precisely how open programs interact, 
and allow us to compose them correctly. However, unlike Larch, we have a further 
specification level, which is intermediate between the framework level and the 
interface specification level. This yields a further level of correct reuse, through 
the specification reduction theorems. 

With regard to modularity in logic programming, there are approaches based 
on ideas similar to our j-models (see [7]), while the approach proposed in [39] 
relates to specification frames [25]. However, none of these approaches distin- 
guishes between specifications and programs. A distinction between programs and 
specifications is made in [40], where modular Prolog programs (as proposed in 
[44]) are derived from first-order specifications (based on Extended ML [43]). 
However, in [40], the role of specifications is different from ours, and there is no 
counterpart of specification frameworks. 

Finally, in the area of object-oriented analysis and design, component-based 
development methods [12,3] have emerged, where components and reuse are two 
of the main aspects of the software development process. In this area, a soft- 
ware component is a unit of composition with contractually specified interfaces 
and context dependencies only [46]. Our compositional units broadly fit this 
characterisation, considering interface specifications $S_\pi \Rightarrow S_\delta$ as interfaces, and 
specifications and their reducibility relation in the context of the framework as 
context dependencies. 

4 Conclusion 

In this paper we have essentially collected our previous work on program spec- 
ification and synthesis, and we have organised it by introducing compositional 
units, which are a more complete and refined version of correct schemas [16]. 
Then we have illustrated the basic operations for extending and correctly reusing 
(composing) compositional units. 

A compositional unit is a unit of reuse that contains both a formalisation of 
the problem domain, at the framework level, and a collection of open programs, 
correct with respect to their specifications, at the specification and program 
levels. The framework level specifies, by the constraint axioms, when and how a 
compositional unit can be correctly reused. The specification and program levels 
support program reuse and development. The examples of Section 3 have been 
mainly devoted to illustrating the role of specifications in the correct reuse of 
compositional units for program development. In particular, specifications are 
a guide for program composition, and specification reduction allows us to deal 
with the problem of adapting the inherited open programs to the specific context 
of reuse. 

In this paper, we have not considered program synthesis, because we con- 
centrated on specifications and their role in the reuse of compositional units 
and correct open programs. However, there is a strong relationship to logic pro- 
gram synthesis [11], and indeed our research started in this area. An interesting 
possibility is to use logic program synthesis as a way of expanding 
frameworks adequately [27]. 

The distinct levels for specifications and programs distinguish our approach. 
At these levels, we have integrated our research on steadfast open programs and 
specifications. As we have shown, specifications and steadfast programs yield a 
further level of reuse, through specification reduction and program composition. 
We believe that this is an important feature of our approach, especially in the 
context of so-called software components [46]. Our future work will be devoted 
to the study of the applicability of our approach to the development of correct 
component-based software. 

On the one hand, we want to develop the approach further, based on logic 
programs, along the following two lines: (a) We will extend our approach to 
other kinds of logic programs. For constraint logic programs and those normal 
programs that have one intended model, the extension of our results is almost 
immediate. (b) We will study methods for deriving steadfast programs from their 
interface specifications, based on our compositional units and on the results 
of [34] and the ideas presented in [16]. To this end, tools would be necessary 
for developing an interactive environment where we can define and compose 
specification frameworks, specifications and programs, and use a proof assistant 
for developing the necessary proofs. We are looking at logical frameworks like 
Isabelle [23] as possible candidates. 

On the other hand, we want to consider the extension of our approach to 
different programming paradigms. This can be done in two ways. The first choice 
is to define, on top of specification frameworks, different specification formalisms, 
oriented to different program languages. Such formalisms would provide different 
interface specification languages, in a way similar to Larch [20]. The second 
choice is to use our compositional units as meta-level declarative specifications 
of systems implemented in possibly imperative programming languages. 

So far, we have considered only the second choice. We began a study of 
object-oriented systems, with the aim of testing the versatility of our model and, 
hopefully, of obtaining a formalisation of object-oriented compositional units 
that could be used as software components. In [32], we introduced a static model 
of object-oriented systems, suitable for formalising states and queries. Our static 
approach shares similarities with [6,36], and allows us to formalise UML class 
and object diagrams [41], queries and OCL constraints [9]. The introduction of 
time in our object-oriented systems is work in progress. 

Our final goal is to obtain a methodology for the specification and the de- 
velopment of correct component-based software, where programs are developed 
together with the formal proof of their correctness. This methodology should 
allow the development of correct compositional units to be used as software 
components, that is, units of composition that can be deployed independently 
and are subject to composition by third parties [46]. 

Abstract. Since the early days of programming and automated reason- 
ing, researchers have developed methods for systematically constructing 
programs from their specifications. The last decade in particular has seen a 
flurry of activities including the advent of specialized conferences, such 
as LOPSTR, covering the synthesis of programs in computational logic. 
In this paper we analyze and compare three state-of-the-art methods for 
synthesizing recursive programs in computational logic. The three ap- 
proaches are constructive/deductive synthesis, schema-guided synthesis, 
and inductive synthesis. Our comparison is carried out in a systematic 
way where, for each approach, we describe the key ideas and synthesize 
a common running example. In doing so, we explore the synergies be- 
tween the approaches, which we believe are necessary in order to achieve 
progress over the next decade in this field. 



1 Introduction 

Program synthesis is concerned with the following question: Given a not nec- 
essarily executable specification, how can an executable program satisfying the 
specification be developed? The notions of “specification” and “executable” are 
here interpreted broadly. The objective of program synthesis is to develop meth- 
ods and tools to mechanize or automate (part of) this process. 

In the last 30 years, program synthesis has been an active research area; see 
e.g. [14,4,40,13,26,29] for a description of major achievements. The starting point 
of program synthesis is usually a formal specification, that is, an expression in 
some formal language (a language having a syntax, a semantics, and usually a 
proof theory). Program synthesis thus has many relationships with formal speci- 
fication [69]. As the end product is a verified correct program, program synthesis 
is also related to formal methods in the development of computer systems [22], 
and to automated software engineering. All of these disciplines share the goal of 
improving the quality of software. 

Program Synthesis in Computational Logic. It is generally recognized 
that a good starting point for program synthesis is to use declarative formalisms 
such as functional programming or computational logic, where one specifies what 
a program should do instead of how. We focus here on the synthesis of recur- 
sive programs in computational logic, which provides an expressive and uniform 
framework for program synthesis. On the one hand, the specification, the result- 
ing program, and their relationship, can all be expressed in the same logic. On 
the other hand, logic specifications can describe complete specifications as well 
as incomplete ones, such as examples or properties of the relation that is to be 
computed. Since all this information can be expressed in the same language, it 
can be treated uniformly in a synthesis process. 

There exist many different approaches to program synthesis in computational 
logic and different ways of viewing and categorizing them. For example, one can 
distinguish constructive from deductive synthesis. In constructive synthesis, a 
conjecture based on the specification is constructively proved, and from this 
proof a program is extracted. In the deductive approach, a program is deduced 
directly from the specification by suitably transforming it. As will be shown 
in this paper, these two approaches can profitably be viewed together and ex- 
pressed in a uniform framework. In a different approach, called schema-based 
synthesis, the idea is to use program schemas, that is, abstractions of 
classes of actual programs, to guide and enhance the synthesis process. Another 
approach is inductive synthesis, where a program is induced from an incomplete 
specification. 

Objectives. Our intent in this paper is to analyze and compare three state-of- 
the-art methods for synthesizing recursive programs in computational logic. The 
chosen approaches are constructive/deductive synthesis, schema-guided synthe- 
sis, and inductive synthesis. We perform our comparison in a systematic way: we 
first identify common, generic features of all approaches and afterwards we use 
a common example to explain these features for each approach. This analysis 
forms the basis for an in-depth comparison. We show, for example, that from 
an appropriately abstract viewpoint, there are a number of synergies between 
the approaches that can be exploited. For example, by identifying rules with 
schemas, all three methods have a common, underlying synthesis mechanism 
and it becomes easier to see how the methods can be fruitfully combined, or dif- 
ferentiated. Overall, we hope that our comparison will deepen the community's 
understanding of the approaches (their relationships, synergies, where they 
excel, and why) and thereby contribute to achieving progress in this field. 

We see this paper as complementary to surveys of program synthesis in com- 
putational logic (or more precisely in logic programming), in particular [26,29]. 
Rather than making a broad survey, we focus on the analysis and in-depth 
comparison of the different approaches and we also consider schema-guided syn- 
thesis. Due to lack of space and to comply with our objectives, some technical 
details are omitted. Here, the reader may rely on his or her intuitive understand- 
ing of relevant concepts or follow pointers to references in the literature. 

Organization. Section 2 presents the different elements that will be used to 
present and compare the chosen synthesis approaches. These elements include 
general features of program synthesis approaches as well as the example that 
will be used for their comparison. Sections 3 through 5 describe the three cho- 
sen approaches: constructive/deductive synthesis, schema-guided synthesis, and 
inductive synthesis. To facilitate a systematic analysis and comparison of the 
methods, each section has a similar structure. Section 6 compares the three 
approaches. Finally, Section 7 draws conclusions and presents perspectives for 
future developments. 

2 Elements of Comparison 

In the subsequent sections, we will present three synthesis approaches. For each 
approach, one representative method is described. However, before describing 
them, we first present their general features. These features are developed in the 
context of each particular method and serve both to facilitate our analysis and 
systematize our comparison. We also introduce our example. 

2.1 General Features 

Specification. The starting point for program synthesis is a specification ex- 
pressed in some language. For each synthesis method, we must fix the specifi- 
cation language and the form of the specification (e.g., a formula or a set of 
examples). 

Mechanism. Program synthesis methods are based on calculi and procedures 
prescribing how programs are synthesized from specifications. Although the un- 
derlying mechanisms of the various systems differ, there are, in some cases, 
similar underlying concepts. 

Heuristics. Program synthesis is search intensive and heuristics are required in 
practice to guide the synthesis process. Are the heuristics specific to a synthesis 
method or are there common heuristics? How effective are the heuristics in the 
different methods and to what extent do different methods structure and restrict 
the search space? 

Background Knowledge. Usually, non-trivial specifications refer to back- 
ground knowledge that formalizes information about the properties of objects 
used in the specification, e.g., theories about the relevant data types. 

Human Interaction. Human interaction involves two different issues. First, 
how much can a human be automatically assisted? Second, what is the nature of 
human-computer interaction in synthesis? How can the human step in and, for 
example, give key steps rather than leave the matter to blind search? Allowing 
input at critical points requires appropriate system support. 

Tool Support. What kind of tool support is needed for turning a synthesis 
method into a viable system? 

Scalability. Scalability is a major concern in program synthesis. Synthesis 
systems should not only be able to synthesize small simple programs, but they 
should also be able to tackle large or complex programs that solve real-life prob- 
lems. 



2.2 The Chosen Example 

The same example will be used throughout the paper to facilitate a comparison 
of the different methods. We have chosen a problem simple enough to present 
in full, but complex enough to illustrate the main issues associated with each 
approach. 

Specification 21 Let L be a list, I a natural number, and E a term. The rela- 
tion atpos(L, I, E) holds iff E is the element of L at position I. By convention, 
the first element of a list is at position 0. The atpos relation can be formally 
specified as follows: 

\[ \mathit{atpos}(L, I, E) \leftrightarrow \exists P, S .\; \mathit{append}(P, E \cdot S, L) \wedge \mathit{length}(P, I) \]

where append and length have their usual meaning, and are assumed to be defined 
in the background theory. 

In the formula above, and in the rest of the paper, free variables are assumed 
to be universally quantified over the entire formula. As list notation, we use nil 
to represent the empty list, and $H \cdot T$ for the list with head H and tail T. 
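
As an informal illustration, the specification of atpos can be read operationally by enumerating every way of splitting L into a prefix P and a remainder of the form E · S. The Python sketch below is only our reading of the formula, not part of any of the synthesis methods; the names `splits` and `atpos_spec` are hypothetical, and lists are modelled as Python lists.

```python
def splits(lst):
    """Yield every (P, rest) with P + rest == lst, i.e. every solution of append(P, rest, lst)."""
    for k in range(len(lst) + 1):
        yield lst[:k], lst[k:]


def atpos_spec(lst, i, e):
    """Direct reading of the specification:
    atpos(L, I, E) iff there exist P, S with append(P, E.S, L) and length(P, I)."""
    for prefix, rest in splits(lst):
        # rest must have the form E . S, so it is non-empty and starts with e
        if len(prefix) == i and len(rest) > 0 and rest[0] == e:
            return True
    return False
```

This reading is deliberately naive: it searches over all prefixes instead of recursing on the list, which is precisely the kind of inefficiency a synthesized recursive program is meant to remove.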

3 Constructive and Deductive Synthesis 

We will now look at two approaches to synthesizing programs that are often 
grouped together: constructive and deductive synthesis. We shall highlight their 
similarities by viewing both from the same perspective: In both cases, deduc- 
tion can be used to synthesize programs by solving for unknowns during the 
application of rules. 

3.1 Background 

For historical reasons, and because the ideas are simplest to present there, we 
begin by considering synthesis of functional programs in constructive type the- 
ory. 

Constructive type theories are logics used for reasoning about functional 
programs. The simplest example is the simply typed λ-calculus [5,48], which we 
briefly review here. Programs in the simply typed λ-calculus are terms in the 
λ-calculus, which are built from variables, application, and abstraction. Types 
are built from a set of base types, closed under the function space constructor →. 

One reasons about judgments that assert that a term t has a type T, relative 
to a sequence of bindings Γ, of the form $x_1 : A_1, \ldots, x_n : A_n$, which associate 
variables to types. The valid judgments are inductively defined by the following 
rules: 

\[ \frac{x : A \in \Gamma}{\Gamma \vdash x : A}\;\mathit{hyp} \qquad \frac{\Gamma, x : A \vdash M : B}{\Gamma \vdash (\lambda x.\, M) : (A \rightarrow B)}\;\mathit{abst} \]

\[ \frac{\Gamma \vdash M : A \rightarrow B \qquad \Gamma \vdash N : A}{\Gamma \vdash (M\,N) : B}\;\mathit{appl} \]

These rules comprise a deduction system for proving that a program t has 
a type T. Under the propositions-as-types interpretation, this type may also be 
understood as a logical proposition (reading → as intuitionistic implication) 
that specifies t’s properties. Of course, the specification language is quite weak, 
so it is difficult to specify many interesting properties. In stronger type theories, 
such as [24,56], types correspond to propositions in richer logics and one can, for 
example, specify sorting as 

\[ \vdash t : (\forall x : \mathit{int\ list} .\; \exists y : \mathit{int\ list} .\; \mathit{perm}(x, y) \wedge \mathit{ord}(y)) . \qquad (1) \]

This asserts that the program t is a function that, on input x, returns an ordered 
permutation y. 

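The given deduction system can be used for program verification: given a 
program t and a specification T, prove $\vdash t : T$. For example, for p and q types, 
we can verify that the program $\lambda x.\, \lambda y.\, x$ satisfies the specification $p \rightarrow (q \rightarrow p)$: 

\[ \dfrac{\dfrac{\dfrac{x : p \in x : p,\, y : q}{x : p,\, y : q \vdash x : p}\;\mathit{hyp}}{x : p \vdash \lambda y.\, x : q \rightarrow p}\;\mathit{abst}}{\vdash \lambda x.\, \lambda y.\, x : p \rightarrow (q \rightarrow p)}\;\mathit{abst} \qquad (2) \]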
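
To make the verification reading concrete, the following Python sketch applies the three rules as a small type checker. It is an illustration under stated assumptions rather than the formulation used in the text: abstractions carry an explicit argument type (Church style) so that types can be computed bottom-up, and the term encoding (`"var"`, `"lam"`, `"app"` tuples) and the names `arrow` and `type_of` are hypothetical.

```python
def arrow(a, b):
    """Build the function type a -> b; base types are plain strings."""
    return ("->", a, b)


def type_of(ctx, term):
    """Return the type T such that ctx |- term : T, or raise TypeError."""
    kind = term[0]
    if kind == "var":                        # rule hyp: x : A must be in the context
        _, name = term
        if name not in ctx:
            raise TypeError(f"unbound variable {name}")
        return ctx[name]
    if kind == "lam":                        # rule abst: extend the context with x : A
        _, name, arg_type, body = term
        extended = dict(ctx)
        extended[name] = arg_type
        return arrow(arg_type, type_of(extended, body))
    if kind == "app":                        # rule appl: function and argument must match
        _, fun, arg = term
        fun_type = type_of(ctx, fun)
        arg_type = type_of(ctx, arg)
        if not isinstance(fun_type, tuple) or fun_type[1] != arg_type:
            raise TypeError("ill-typed application")
        return fun_type[2]
    raise ValueError(f"unknown term {term!r}")


# The program from the example: lambda x. lambda y. x, with x : p and y : q.
K = ("lam", "x", "p", ("lam", "y", "q", ("var", "x")))
```

Calling `type_of({}, K)` retraces proof (2) from the leaves to the root and returns the encoding of $p \rightarrow (q \rightarrow p)$.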

Perhaps less obviously, the same rules can be used for program synthesis: 
given a specification T, construct a program t such that $\vdash t : T$. This can be 
done by 

1. Reversing the direction in which rules are applied and proofs are constructed. 
   That is, build the proof in a goal-directed, “refinement style” way by starting 
   with the goal and working towards the axioms. 
2. Leaving the program t as an unknown, or metavariable, which is solved 
   during proof. 

Let’s try this out in the example above. Using capital letters to indicate 
metavariables, we begin with 

\[ \vdash R : p \rightarrow (q \rightarrow p) . \]

Resolving this with the (conclusion of the) abst rule yields the new goal 

\[ x : p \vdash R_1(x) : (q \rightarrow p) , \]

where R is unified with $\lambda x.\, R_1(x)$. Applying abst again results in 

\[ x : p,\, y : q \vdash R_2(x, y) : p , \]

where $R_1(x) = \lambda y.\, R_2(x, y)$. Finally, applying hyp unifies the assumption $x : p$ 
with $R_2(x, y) : p$, instantiating $R_2(x, y)$ to x and completing the proof. Compos- 
ing the substitutions yields the previously verified program $t = \lambda x.\, \lambda y.\, x$. 
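
The goal-directed reading can be sketched in Python as a small proof search that returns the term built along the way. This is a simplification and an assumption on our part: instead of metavariables solved by higher-order unification, the sketch uses plain recursion whose return value plays the role of the accumulated substitution. The names `synthesize` and `result_type`, the tuple encoding of terms and types, and the depth bound are all hypothetical.

```python
import itertools


def result_type(ty):
    """Final result of a type: for A1 -> (... -> (An -> b)) this is the base type b."""
    while isinstance(ty, tuple):
        ty = ty[2]
    return ty


def synthesize(goal, ctx=(), depth=4, fresh=None):
    """Search for a term t with ctx |- t : goal; return None if none is found.

    Types are base-type strings or ("->", A, B); ctx is a tuple of (name, type).
    """
    if fresh is None:
        fresh = itertools.count()
    if depth < 0:
        return None
    if isinstance(goal, tuple):               # goal A -> B: apply abst backwards
        _, a, b = goal
        name = f"x{next(fresh)}"
        body = synthesize(b, ctx + ((name, a),), depth, fresh)
        return None if body is None else ("lam", name, body)
    for name, ty in ctx:                      # base-type goal: hyp, possibly under appl
        if result_type(ty) != goal:
            continue
        term, current, ok = ("var", name), ty, True
        while isinstance(current, tuple):     # supply one argument per arrow
            arg = synthesize(current[1], ctx, depth - 1, fresh)
            if arg is None:
                ok = False
                break
            term, current = ("app", term, arg), current[2]
        if ok:
            return term
    return None
```

For the goal $p \rightarrow (q \rightarrow p)$ the search applies abst twice and then hyp, mirroring the three steps above; the bound variables are simply named `x0` and `x1` instead of x and y.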

The account above is complicated by the fact that the abstraction operator λ 
binds variables and, to work properly, higher-order unification is required when 
applying rules. The rules constitute clauses in a higher-order (meta-)language 
and proofs are constructed by higher-order resolution. A higher-order logic pro- 
gramming language or logical framework based on higher-order resolution like 
λProlog [27], ELF [61], or Isabelle [59] would support this kind of proof. 

There are two conclusions we would like to draw. First, verification and 
synthesis are closely related activities. In fact, when rules are applied using 
(higher-order) resolution, they are essentially identical. The only difference is 
whether unification is between ground or non-ground terms, i.e., whether or 
not an answer substitution is built. This conclusion should not be surprising to 
those working in logic programming: the same sequence of resolution steps can 
be used to establish a ground query p(t) or a non-ground one p(X), generating 
the substitution X = t. 

Second, constructive synthesis is of a deductive nature and the line between 
the two can be fine. As the analogy with Prolog shows, proofs construct objects. 
In type theory, the objects are programs. Indeed, the idea of proofs synthesizing 
programs, sometimes called proofs-as-programs, can be decomposed into 

proofs-as-programs = proofs-as-objects + objects-as-programs. 

In our example, unification, not the constructivity of the logic, is responsible 
for constructing an object. Constructivity does not play a role in the synthesis 
of objects, but rather in their execution and meaning. That is, because the 
logic is constructive, the synthesized terms can be executed and their evaluation 
behavior agrees with the semantics of the type theory. In contrast, [49], for 
example, presents a classical type theory where programs correspond to (non- 
computable) oracles that cannot be executed. There one might say that the 
line is crossed from constructive (and deductive) program synthesis to deductive 
object synthesis. 

The use of unification is at the heart of deductive and constructive synthesis. 
Unification is driven by resolution, to synthesize, or solve for, programs during 
proofs. This idea goes back to work in the 1960s on using first-order resolution 
to construct terms that represent plans or, more generally, programs [19,42]. In 
the logical framework community, the use of higher-order metalogics to represent 
rules and the use of higher-order unification to apply them is now standard, e.g., 
[2,8,9,23]. For example, the Isabelle distribution [59] comes with encodings of a 
number of type theories, where programs can be synthesized as described here. 

The vast majority of approaches for synthesizing logic programs are based 
on first-order reasoning, e.g., equivalence preserving transformations. There have 
been many proposed methods and [26] contains a good survey. They differ in 
the form of their axioms (Horn clauses, iff-definitions, etc.), the exact notion of 
equivalence used (and there are many, see e.g., [55]), and ease of automation. 
Many of these, for example unfold-fold based transformations [60], can be recast 
as synthesis by resolution using rules like those presented here [7,10]. 



3.2 Overview 

Specifications. In type theory, programs and specifications belong to differ- 
ent languages. When synthesizing logic programs, the specification language is 
typically the language of a first-order theory and the programming language is 
some suitable, executable subset thereof. By sharing the same language, logic 
programs are well suited for deductive synthesis where specifications are manip- 
ulated, using equivalence preserving transformations, until a formula with some 
desired form or property is reached. 



Mechanism. The mechanism for synthesizing logic programs during proofs is 
essentially the same as what we have just seen for type theory. However, what is 
proved (i.e., the form of the theorem to be proven), and the proof rules used to 
establish it, are of course different. Namely, we will prove theorems about equiv- 
alences between specifications and programs and we will prove these theorems 
using rules suitable for establishing such equivalences. 

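For our example, we will employ the following rules: 

\[ \frac{}{A \leftrightarrow A}\;{\leftrightarrow}\text{-refl} \qquad \frac{A_1 \leftrightarrow B_1 \qquad A_2 \leftrightarrow B_2}{(A_1 \vee A_2) \leftrightarrow (B_1 \vee B_2)}\;{\vee}\text{-split} \]

In addition, for building recursive programs that recurse over lists we employ 
the rule schema 

\[ \frac{A_1 \qquad A_2 \qquad A_3}{\forall L, X .\; P(L, X) \leftrightarrow Q(L, X)}\;\mathit{ind} , \]

where L is a variable ranging over lists, X denotes sequences of zero or more 
variables of any type, and the assumptions $A_i$ are: 

\[ A_1 \equiv \forall L, X .\; Q(L, X) \leftrightarrow (L = \mathit{nil} \wedge B(X)) \vee (\exists H, T .\; L = H \cdot T \wedge S(H, T, X)) \]

\[ A_2 \equiv \forall X .\; P(\mathit{nil}, X) \leftrightarrow B(X) \]

\[ A_3 \equiv \forall T .\; (\forall X .\; P(T, X) \leftrightarrow Q(T, X)) \rightarrow (\forall H, X .\; P(H \cdot T, X) \leftrightarrow S(H, T, X)) \]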

This rule, which can be derived by induction on the list L, states the equiv- 
alence between predicates P and Q (which are metavariables). For the purpose 
of synthesis, we can take $A_1$ as the definition of Q, and $A_2$ and $A_3$ constrain 
(and will be used to define) Q's base and recursive cases. In $A_3$, we are allowed 
to use the existence of Q, when defining Q, but only on smaller arguments. 

We will show below how, by applying these rules (using higher-order resolu- 
tion), we can construct R while proving its equivalence to atpos. 

Heuristics and Human Interaction. Proof rules, like those given above, 
can be applied interactively, semi-interactively, or even automatically. The use 
of a tactic based theorem prover [41], which allows users to write programs that 
construct proofs, leaves open the degree of automation. 

[50,51], for example, show how to completely automate the construction of 
such synthesis proofs in a tactic based setting. In this work, the most important 
tactic implements the rippling heuristic of [17,12]. This heuristic automates the 
application of rewrite or equivalence preserving transformation rules in a way 
that minimizes differences between terms or formulas. Rippling is typically used 
in inductive theorem proving to enable the use of the induction hypothesis in 
simplifying the induction conclusion and it can be used in a similar way dur- 
ing program synthesis where rules that introduce recursion (like ind) produce 
induction-like proof obligations. Rippling has been used to automate completely 
the synthesis of a number of non-trivial logic programs. However, it should be 
noted that some interaction with the user is often desirable since the application 
of proof rules, in particular rules that build recursive programs, determines the 
efficiency of the synthesized program. 

Background Knowledge. The approach we present here for synthesizing 
logic programs involves two kinds of rules. The first kind are rules, like $\leftrightarrow$-refl 
and $\vee$-split, which are derived rules of first-order logic. These derived rules 
are not, strictly speaking, necessary (provided we are working in a complete 
axiomatization of first-order logic), but their addition makes it easier to construct 
synthesis proofs by reasoning about equivalences. The second kind of rules are 
theory specific rules, e.g., rules about inductively defined data types like numbers 
and lists. The rule ind given above is an example of such a rule. It is derivable 
in a theory that axiomatizes lists and formalizes induction over lists. 

Tool Support. For synthesizing the atpos example, we have used the Isabelle 
system. Isabelle’s basic mechanism for proof construction is top-down proof by 
higher-order resolution, which is precisely what we require. Moreover, as a logi- 
cal framework, Isabelle supports the derivation of new rules, so we can formally 
derive, and thus ensure the correctness of, the specialized rules needed for synthe- 
sis; in our example, we derive the rules just presented in a standard first-order 







David Basin et al. 



theory of lists. Finally, tactics can be used to partially, or entirely, automate 
proof construction. The Isabelle distribution comes with simplifiers and decision 
procedures that we used to semi-automate synthesis. 

Scalability. The search space in most approaches to deductive synthesis is 
quite large. In practice, building non-trivial programs requires an environment 
that supports heuristics for automating simple proof steps, e.g., by the applica- 
tion of tactics. It is also important that the user can safely augment a synthesis 
system with derived rules. As we will later observe, schemas, for schema-guided 
synthesis, can be seen as derived rules specialized for synthesizing programs of a 
particular form, and their integration with deductive synthesis approaches can 
help with large scale developments. Examples of this are provided in [1]. 

3.3 Example 

Let us illustrate our synthesis method on the atpos example. We wish to con- 
struct a logic program equivalent to the specification 21. As with synthesis in 
the type theory, we use a metavariable, R, to stand in for the desired program. 
Hence we start with 



\[ \vdash \forall L, I, E .\; \mathit{atpos}(L, I, E) \leftrightarrow R(L, I, E) . \qquad (3) \]

Working backwards, resolving (using higher-order unification) this conclusion 
with the conclusion of the ind rule yields the three subgoals 

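\[ \forall L, I, E .\; R(L, I, E) \leftrightarrow (L = \mathit{nil} \wedge B(I, E)) \vee (\exists H, T .\; L = H \cdot T \wedge S(H, T, I, E)) \]

\[ \forall I, E .\; \mathit{atpos}(\mathit{nil}, I, E) \leftrightarrow B(I, E) \]

\[ \forall T .\; (\forall I, E .\; \mathit{atpos}(T, I, E) \leftrightarrow R(T, I, E)) \rightarrow (\forall H, I, E .\; \mathit{atpos}(H \cdot T, I, E) \leftrightarrow S(H, T, I, E)) \]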

and Q is unified with R. 

The first subgoal constitutes a program template, which will later be filled 
out by solving the other subgoals. In the second subgoal, expanding the definition 
of atpos results in 

\[ \vdash \forall I, E .\; (\exists P, S .\; \mathit{append}(P, E \cdot S, \mathit{nil}) \wedge \mathit{length}(P, I)) \leftrightarrow B(I, E) . \]

Let I and E be arbitrary. To show 

\[ \vdash (\exists P, S .\; \mathit{append}(P, E \cdot S, \mathit{nil}) \wedge \mathit{length}(P, I)) \leftrightarrow B(I, E) , \]

observe that there are no values for P or S for which $\mathit{append}(P, E \cdot S, \mathit{nil})$ is 
true. Hence this subgoal is equivalent to 

\[ \vdash \mathit{false} \leftrightarrow B(I, E) . \]

We can complete the proof with $\leftrightarrow$-refl, which unifies B(I, E) with false. 

For the third subgoal, we assume the existence of an arbitrary list T and 
the antecedent of the implication (which amounts to an induction hypothesis) 
and must prove the consequent (the induction conclusion). Hence, expanding 
the definition of atpos, we assume 

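\[ \forall I, E .\; (\exists P, S .\; \mathit{append}(P, E \cdot S, T) \wedge \mathit{length}(P, I)) \leftrightarrow R(T, I, E) \]

and we must prove, for some arbitrary H, I, and E, 

\[ \vdash (\exists P, S .\; \mathit{append}(P, E \cdot S, H \cdot T) \wedge \mathit{length}(P, I)) \leftrightarrow S(H, T, I, E) . \]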

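Now, since P ranges over lists, for any formula $\phi$, $\exists P .\, \phi(P)$ is equivalent 
(by case analysis) to $\phi(\mathit{nil}) \vee \exists H, T .\, \phi(H \cdot T)$. Hence, the above is equivalent to 

\[ \vdash ((\exists S .\; \mathit{append}(\mathit{nil}, E \cdot S, H \cdot T) \wedge \mathit{length}(\mathit{nil}, I)) \vee (\exists H', T', S .\; \mathit{append}(H' \cdot T', E \cdot S, H \cdot T) \wedge \mathit{length}(H' \cdot T', I))) \leftrightarrow S(H, T, I, E) . \]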

We proceed by decomposing the disjunction on the left-hand side by resolving 
with $\lor$-split. Doing so builds a disjunction for $S$, by instantiating $S(H, T, I, E)$ 
with $S_1(H, T, I, E) \lor S_2(H, T, I, E)$, and yields the two subgoals: 

$$\vdash (\exists S .\; \mathit{append}(\mathit{nil}, E \cdot S, H \cdot T) \land \mathit{length}(\mathit{nil}, I)) \leftrightarrow S_1(H, T, I, E)$$

$$\vdash (\exists H', T', S .\; \mathit{append}(H' \cdot T', E \cdot S, H \cdot T) \land \mathit{length}(H' \cdot T', I)) \leftrightarrow S_2(H, T, I, E)$$

For the first, the left-hand side is true whenever 
$\exists S .\; E = H \land S = T \land I = 0$. Hence, setting $S$ to $T$, this subgoal is 
equivalent to 

$$\vdash (E = H \land I = 0) \leftrightarrow S_1(H, T, I, E) .$$

We can again discharge this using $\leftrightarrow$-refl, which unifies $S_1(H, T, I, E)$ with 
$E = H \land I = 0$. Now, under the standard definition of append and length, the 
second subgoal is equivalent to 

$$\vdash (\exists I' .\; s(I') = I \land (\exists T', S .\; \mathit{append}(T', E \cdot S, T) \land \mathit{length}(T', I'))) \leftrightarrow S_2(H, T, I, E)$$

where $s(I')$ represents the successor of $I'$. We can now simplify this using the 
antecedent (induction hypothesis), which yields 

$$(\exists I' .\; s(I') = I \land R(T, I', E)) \leftrightarrow S_2(H, T, I, E) .$$

We complete the proof with $\leftrightarrow$-refl, unifying $S_2(H, T, I, E)$ with 
$\exists I' .\; s(I') = I \land R(T, I', E)$. 

We are done! If we apply the accumulated substitutions to the remaining 
assumption $H_1$ we have 

$$\begin{aligned}
\forall L, I, E .\; R(L, I, E) \leftrightarrow\; & (L = \mathit{nil} \land \mathit{false}) \\
& \lor (\exists H, T .\; L = H \cdot T \land ((E = H \land I = 0) \\
& \qquad \lor (\exists I' .\; s(I') = I \land R(T, I', E))))
\end{aligned}$$

and we have proved the equivalence of (3) under this definition, i.e., 
$\mathit{atpos}(L, I, E)$ is equivalent to the synthesized instance of $R(L, I, E)$. 

The alert reader may have wondered why we did not complete the proof 
earlier by resolving with $\leftrightarrow$-refl. In this example, our goal was to transform 
atpos so that the result falls within a particular subset of first-order formulae, 
sometimes called pure logic programs [16] or logic descriptions [25], that define 
logic programs. These formulae can be easily translated to Horn clauses or run 
directly in a language like Gödel [47]. In this case, we get the clauses: 

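$$\begin{aligned}
\mathit{atpos}(\mathit{nil}, I, E) &\leftarrow \mathit{false} \\
\mathit{atpos}(H \cdot T, I, E) &\leftarrow E = H,\; I = 0 \\
\mathit{atpos}(H \cdot T, I, E) &\leftarrow s(I') = I,\; \mathit{atpos}(T, I', E)
\end{aligned}$$

which can be simplified to 

$$\begin{aligned}
\mathit{atpos}(E \cdot \_, 0, E) &\leftarrow \\
\mathit{atpos}(\_ \cdot T, s(I'), E) &\leftarrow \mathit{atpos}(T, I', E)
\end{aligned}$$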
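
To make the claimed equivalence concrete, the following minimal Python sketch compares the specification of atpos with the synthesized clauses on all small inputs. It is an illustration only, not part of the synthesis method: Python lists stand in for list terms, non-negative integers for successor terms, and the names `atpos_spec`, `atpos_prog`, and `agree_on_small_inputs` are assumptions introduced here.

```python
from itertools import product

def atpos_spec(lst, i, e):
    """Specification: there exist P, S with append(P, e.S, lst) and length(P, i)."""
    # Each split point k corresponds to P = lst[:k] and e.S = lst[k:].
    return any(k == i and lst[k] == e for k in range(len(lst)))

def atpos_prog(lst, i, e):
    """The synthesized clauses, read as a recursive check."""
    if not lst:                        # atpos(nil, I, E) <- false
        return False
    head, tail = lst[0], lst[1:]
    if i == 0:                         # atpos(H.T, I, E) <- E = H, I = 0
        return head == e
    # atpos(H.T, I, E) <- s(I') = I, atpos(T, I', E)
    return i > 0 and atpos_prog(tail, i - 1, e)

def agree_on_small_inputs(alphabet=("a", "b"), max_len=4):
    """Exhaustively compare both definitions on short lists (a test, not a proof)."""
    for n in range(max_len + 1):
        for items in product(alphabet, repeat=n):
            for i in range(max_len + 1):
                for e in alphabet:
                    if atpos_spec(list(items), i, e) != atpos_prog(list(items), i, e):
                        return False
    return True
```

Such a bounded comparison only samples the equivalence that the proof establishes for all lists; it is useful mainly as a sanity check on the reading of the clauses.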



3.4 Analysis 

Overall, when cast in this way, the deductive synthesis of logic programs is quite 
similar to the previous constructive/deductive synthesis of functional programs. 
In both cases, we leave the program as an unknown, and solve for it, by uni- 
fication, during proof. Of course, the metatheoretic properties of the programs 
produced are quite different. In the case of logic program synthesis, the rules, 
as they are given, do not enforce that the object constructed has any special 
syntactic properties (e.g., is a pure logic program); we only know that it is an 
equivalent formula. Moreover, we do not a priori know anything about its ter- 
mination behavior (although it is not difficult to show that the induction rule 
builds predicates that terminate when the first argument is ground). 

This kind of development, as with most approaches to logic program synthesis, 
is best described as deductive synthesis. Such developments are constructive 
only in the weak sense that, at the metalevel (or metalogic, if one is carrying 
out the proof in a logical framework), one is essentially proving a theorem of 
the form 

$$\exists R .\; \forall L, I, E .\; \mathit{atpos}(L, I, E) \leftrightarrow R(L, I, E)$$

and building a witness (in this case, a predicate definition) for R. (For more 
on this notion of constructivity and the proof theory behind it, see [11].) Many 
proposed methods for the constructive synthesis of logic programs can also be 
explained in this way. For example, the Whelk Calculus of [71], which is moti- 
vated by experiments in synthesizing relations in a constructive type theory, can 
be recast as this kind of synthesis [6]. 

4 Schema-Guided Synthesis 

We here outline Flener, Lau, Ornaghi, and Richardson’s definition, representa- 
tion, and semantics of program schemas: see [33] for details. 



4.1 Background 

Intuitively, a program schema is an abstraction of a class of actual programs, 
in the sense that it represents their data-flow and control-flow, but neither con- 
tains all their actual computations nor all their actual data structures. Program 
schemas have been shown to be useful in a variety of applications. In synthesis, 
the main idea is to simplify the proof obligations by taking the difficult ones 
offline, so that they are proven once and for all at schema design time. Also, the 
reuse of existing programs is made the main synthesis mechanism. 

A symbol occurring in a theory T is open [52] in T if it is neither defined in T, 
nor a predefined symbol. A non-open symbol in T is closed in T. A theory with at 
least one open symbol is an open theory; otherwise it is closed. This terminology 
applies to formal specifications and logic programs. An (open) program for a 
relation r is steadfast [25,53] with respect to its specification if it is correct with 
respect to its specification whenever composed with programs that are correct 
with respect to the specifications of its (open) relations other than r. 

Among the many possible forms of programs, there are the divide-and-conquer 
programs with one recursive call: if a distinguished formal parameter, called the 
induction parameter, say X, has a minimal value, then one can directly solve 
for a corresponding other formal parameter, called the result parameter, say Y; 
otherwise, X is decomposed into a “smaller” value T (under some well-founded 
relation $\prec$) by splitting off a quantity H, so that a sub-result V corresponding to 
T can be computed by a recursive call, and an overall result Y can be composed 
from H and V. A third formal parameter, called the passive parameter, say Z, 
participates unchanged in these operations. Formally, this problem-independent 
dataflow and control-flow can be captured in the following open program for r: 

$$\begin{aligned}
r(X, Y, Z) &\leftarrow \mathit{min}(X, Z),\; \mathit{solve}(X, Y, Z) \\
r(X, Y, Z) &\leftarrow \lnot \mathit{min}(X, Z),\; \mathit{dec}(X, Z, H, T),\; r(T, V, Z),\; \mathit{comp}(H, Z, V, Y)
\end{aligned} \tag{DC}$$

The relations min, solve, dec, comp are open. When I is the induction parameter, 
L the result, and E the passive parameter, so that 
$\mathit{atpos}(L, I, E) \leftrightarrow r(I, L, E)$, a closed program for atpos is the instance of DC 
under the program substitution 

$$\begin{aligned}
\mathit{min}(X, Z) &\leftarrow X = 0 & \mathit{solve}(X, Y, Z) &\leftarrow Y = Z \cdot S \\
\mathit{dec}(X, Z, H, T) &\leftarrow X = s(T) & \mathit{comp}(H, Z, V, Y) &\leftarrow Y = F \cdot V
\end{aligned} \tag{$\phi_1$}$$

This substitution captures the problem-dependent computations of that program. 
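
The separation between the problem-independent template and the problem-dependent substitution can be sketched in Python as a higher-order function. This is a hedged illustration rather than the formal open program: it works in checking mode only, and `make_dc`, `is_min`, and `comp_candidates` are names assumed here. In particular, `comp_candidates` enumerates the sub-results V compatible with a given Y, an inversion of comp chosen so that the sketch can run.

```python
def make_dc(is_min, solve, dec, comp_candidates):
    """Build a checker r(x, y, z) from the four open relations of the template."""
    def r(x, y, z):
        if is_min(x, z):
            # r(X,Y,Z) <- min(X,Z), solve(X,Y,Z)
            return solve(x, y, z)
        # r(X,Y,Z) <- not min(X,Z), dec(X,Z,H,T), r(T,V,Z), comp(H,Z,V,Y)
        for h, t in dec(x, z):
            for v in comp_candidates(h, z, y):   # every V with comp(h, z, V, y)
                if r(t, v, z):
                    return True
        return False
    return r

# Instance for atpos, following the program substitution in the text.
r = make_dc(
    is_min=lambda x, z: x == 0,                              # min(X,Z) <- X = 0
    solve=lambda x, y, z: len(y) > 0 and y[0] == z,          # Y = Z.S for some S
    dec=lambda x, z: [(None, x - 1)] if x > 0 else [],       # X = s(T); H is unconstrained
    comp_candidates=lambda h, z, y: [y[1:]] if y else [],    # Y = F.V for some F
)

def atpos(lst, i, e):
    """atpos(L, I, E) is read as r(I, L, E)."""
    return r(i, lst, e)
```

Swapping in other arguments for the four parameters yields other instances of the same control flow, which is also why unintended instances are possible when no constraints are imposed.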

But programs by themselves are syntactic entities, hence some programs 
are undesired instances of open programs. For instance, the generate-and-test 
program $r(X, Y, Z) \leftarrow g(X, Y, Z),\; t(Y, Z)$ is an instance of DC under the 
substitution 

$$\begin{aligned}
\mathit{min}(X, Z) &\leftarrow \mathit{true} & \mathit{solve}(X, Y, Z) &\leftarrow g(X, Y, Z),\; t(Y, Z) \\
\mathit{dec}(X, Z, H, T) &\leftarrow \mathit{true} & \mathit{comp}(H, Z, V, Y) &\leftarrow \mathit{true}
\end{aligned}$$

An open program such as DC thus has no fixed meaning. The knowledge cap- 
tured by an open program is not completely formalized, and the domain knowl- 
edge and underlying language are still implicit. In order for such open programs 
to be useful for guiding synthesis, such undesired instances need to be prevented 
and some semantic considerations need to be explicitly added. 

A program schema [33] has a name, a set of formal sort and relation pa- 
rameters, a signature with sorted relation and function declarations, a set of 
axioms defining the declared symbols, a set of constraints restricting the actual 
parameters, an open program T called the template, and specifications S of the 
relations in T, such that T is steadfast with respect to S in that axiomatization. 

The schema $\mathcal{DC}$ can be abduced, as in [32], from our informal account of how 
divide-and-conquer programs work. The parameters SX, SY, SZ, SH are sorts; 
they are used in the signatures of the other parameters, which are relations. 
There are no axioms because the signature declares no other symbols than the 
parameters. The template is the open program DC, which defines the relation r 
and has min, solve, dec, comp as open relations. The closed relation r is specified 
by $S_r$, and the open relations have $S_{min}$, $S_{solve}$, $S_{dec}$, $S_{comp}$ as specifications. 
The conditional specification $S_r$ exhibits $i_r$, $o_r$ as the input/output conditions 
of r, while $S_{dec}$ exhibits $i_{dec}$, $o_{dec}$ as the input/output conditions of dec. The 
input/output conditions of the remaining open relations are also expressed in 
terms of the parameters $i_r$, $i_{dec}$, $o_r$, $o_{dec}$. The constraints restrict dec to succeed 
at least once if its input condition holds, and then to yield a value that satisfies 
the input condition of r (so that a recursive call to r is “legal”) and that is 
smaller than X according to $\prec$, which must be a well-founded relation (so that 
recursion terminates). The open program DC is steadfast with respect to $S_r$, 
within the given axiomatization. 

In the schema $\mathcal{REUSE}$, the parameters SX, SY, SZ are sorts; they are used 
in the signatures of the other parameters, which are relations. There are no 
axioms because the signature declares no other symbols than the parameters. 
The template is the open program $\{r(X, Y, Z) \leftarrow q(X, Y, Z)\}$, which defines the 
relation r and has q as the open relation. The relation r is specified by $S_r$, and the 
relation q has the same input/output conditions as r. There are no constraints 
on the parameters. This schema provides for the reuse of a program for q when 
starting from a specification for r. The open program Reuse is steadfast with 
respect to $S_r$, within the given axiomatization. 

4.2 Overview 

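Let us now examine the specifications, mechanism, heuristics, background 
knowledge, human interaction, tool support, and scalability of schema-guided 
synthesis. 

Schema $\mathcal{DC}(SX, SY, SZ, SH, \prec, i_r, o_r, i_{dec}, o_{dec})$ 

SORTS: $SX, SY, SZ, SH$ 

RELATIONS: $i_r, i_{dec} : (SX, SZ)$; $\;\prec\; : (SX, SX)$; $o_r : (SX, SY, SZ)$; $o_{dec} : (SX, SZ, SH, SX)$ 

AXIOMS: (none) 

CONSTRS: 

$$i_{dec}(X, Z) \rightarrow \exists H : SH .\; \exists T : SX .\; o_{dec}(X, Z, H, T) \tag{$C_1$}$$

$$i_{dec}(X, Z) \land o_{dec}(X, Z, H, T) \rightarrow i_r(T, Z) \land T \prec X \tag{$C_2$}$$

$$\mathit{wellFounded}(\prec) \tag{$C_3$}$$

SPECIFS: 

$$i_r(X, Z) \rightarrow (r(X, Y, Z) \leftrightarrow o_r(X, Y, Z)) \tag{$S_r$}$$

$$i_r(X, Z) \rightarrow (\mathit{min}(X, Z) \leftrightarrow \lnot i_{dec}(X, Z)) \tag{$S_{min}$}$$

$$i_r(X, Z) \land \lnot i_{dec}(X, Z) \rightarrow (\mathit{solve}(X, Y, Z) \leftrightarrow o_r(X, Y, Z)) \tag{$S_{solve}$}$$

$$i_{dec}(X, Z) \rightarrow (\mathit{dec}(X, Z, H, T) \leftrightarrow o_{dec}(X, Z, H, T)) \tag{$S_{dec}$}$$

$$o_{dec}(X, Z, H, T) \land o_r(T, V, Z) \rightarrow (\mathit{comp}(H, Z, V, Y) \leftrightarrow o_r(X, Y, Z)) \tag{$S_{comp}$}$$

TEMPLATE: 

$$\begin{aligned}
r(X, Y, Z) &\leftarrow \mathit{min}(X, Z),\; \mathit{solve}(X, Y, Z) \\
r(X, Y, Z) &\leftarrow \lnot \mathit{min}(X, Z),\; \mathit{dec}(X, Z, H, T),\; r(T, V, Z),\; \mathit{comp}(H, Z, V, Y)
\end{aligned} \tag{DC}$$

Schema $\mathcal{REUSE}(SX, SY, SZ, i_r, o_r)$ 

SORTS: $SX, SY, SZ$ 

RELATIONS: $i_r : (SX, SZ)$; $o_r : (SX, SY, SZ)$ 

AXIOMS: (none) 

CONSTRAINTS: (none) 

SPECIFICATIONS: 

$$i_r(X, Z) \rightarrow (r(X, Y, Z) \leftrightarrow o_r(X, Y, Z)) \tag{$S_r$}$$

$$i_r(X, Z) \rightarrow (q(X, Y, Z) \leftrightarrow o_r(X, Y, Z)) \tag{$S_q$}$$

TEMPLATE: 

$$r(X, Y, Z) \leftarrow q(X, Y, Z) \tag{Reuse}$$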



Specifications. Among the many possible forms of specifications, there are 
the classical conditional specifications: under some input condition $i_r$ on inputs 
X, Z, a program for relation r succeeds iff some output condition $o_r$ on X, Z 
and output Y holds. Formally, this gives rise to the following open specification 
of r: 

$$\forall X : SX .\; \forall Y : SY .\; \forall Z : SZ .\; i_r(X, Z) \rightarrow (r(X, Y, Z) \leftrightarrow o_r(X, Y, Z)) \tag{Cond}$$

The open symbols are the relations $i_r$, $o_r$ and the sorts SX, SY, SZ. Other forms 
of specification can also be handled. 

Mechanism. Schema-guided synthesis from a specification $S_0$ is a tree 
construction process consisting of 5 steps, where the initial tree has just one 
node, namely $S_0$: 

1. Choose a specification $S_i$ that has not been handled yet. 

2. Choose a program schema with parameters P, axioms A, constraints C, 
template T, and specifications S. 

3. Infer a substitution $\theta_1$ under which $S_i$ is an instance of the specification 
(available in S) of the defined relation in template T. This instantiates some 
(if not all) of the parameters P. 

4. Choose a substitution $\theta_2$ that instantiates the remaining (if any) parameters 
in P, such that the constraints C hold (i.e., such that $\theta_1 \cup \theta_2 \vdash C$) and such 
that one can reuse existing programs $P_q$ for some (if not all) of the now fully 
instantiated specifications $S \cup \theta_1 \cup \theta_2$ of the open relations in template T. 
Simplify the remaining (if any) specifications in $S \cup \theta_1 \cup \theta_2$, yielding $S_q$. 

5. Add $T \cup P_q$, called the reused program, to the node with $S_i$ and add 
the elements of $S_q$ to the unhandled specifications, as children of $S_i$. 

These steps are iterated until all specifications have been handled; the overall 
result program $P_0$ for $S_0$ is then assembled by conjoining, at each node, the reused 
programs. If any of these steps fails, synthesis backtracks to its last choice point. 
Schema-guided program synthesis is thus a recursive specification (problem) 
decomposition process followed by a recursive program (solution) composition 
process. 
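
The control structure of these steps, decomposition with backtracking followed by composition, can be sketched as a small recursive search. This is a deliberately simplified sketch under strong assumptions: specifications are plain strings, a schema is modeled as a hypothetical function that either fails (returns `None`) or returns a reused program together with new specifications, and `synthesize`, `reuse`, `divide_and_conquer`, and `LIBRARY` are names invented for this illustration. The substitution inference and proof obligations of Steps 3 and 4 are not modeled.

```python
def synthesize(spec, schemas):
    """Return the list of reused programs for spec, or None if every schema fails."""
    for schema in schemas:                   # Step 2: choice point over schemas
        outcome = schema(spec)               # Steps 3-4, abstracted into one call
        if outcome is None:
            continue
        reused_program, new_specs = outcome
        parts = [reused_program]             # Step 5: attach the reused program
        for child in new_specs:              # children are handled recursively
            sub = synthesize(child, schemas)
            if sub is None:
                break                        # backtrack: try the next schema
            parts.extend(sub)
        else:
            return parts                     # all children handled
    return None

# Hypothetical base of reusable programs, keyed by specification name.
LIBRARY = {"S'_min": "P_min", "S'_solve": "P_solve", "S'_comp": "P_comp"}

def reuse(spec):
    program = LIBRARY.get(spec)
    return None if program is None else (program, [])

def divide_and_conquer(spec):
    if spec != "S_atpos":
        return None
    return ("DC + P_dec", ["S'_min", "S'_solve", "S'_comp"])
```

Listing `reuse` first in the schema list corresponds to preferring reuse over further decomposition whenever a matching program is available.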

The $\mathcal{REUSE}$ schema can be chosen at Step 2; it forces the reuse at Step 4 of 
a program for q, because q is its only open relation. Every schema leads to some 
reuse at Step 4; for instance, $\mathcal{DC}$ results in the reuse of a program for dec. 

Heuristics. Many choice points reside in schema-guided synthesis, so heuristics 
are needed to make good decisions, possibly by looking ahead into the synthesis. 

Some heuristics can be applied when designing a schema. For instance, a 
synthesis strategy is the choice at Step 4 of the open relations for which programs 
are reused. All templates envisaged by us so far have only a few meaningful 
strategies, hence it is best to hardwire these. For instance, template DC has 
only two interesting strategies: when starting with dec, the divide-and-conquer 
schema is as above; when starting with comp, it would have to be reexpressed 
in terms of the input/output conditions of r and comp, giving rise to another 
schema, with the same template. 

Other heuristics can be expressed as applicability conditions. For instance, 
the question arises of what program schema to apply at Step 2. An implicit 
heuristic can be achieved by ordering the schemas; putting $\mathcal{REUSE}$ first would 
enforce our emphasis on reuse. There also is the question of how to apply a chosen 
program schema at Step 3. For instance, with $\mathcal{DC}$, one of the formal parameters 
in the given specification $S_r$ has to be the induction parameter, and another 
the result parameter. This can be done based on the sort information in $S_r$: 
only a parameter of an inductively defined sort can be the induction parameter. 
One can also augment specifications with mode information, because parameters 
declared to be ground at call-time are particularly good induction parameters 
[25]. 

Background Knowledge. Step 2 assumes a base of program schemas, captur- 
ing a range of program classes. Also, Step 4 relies on a base of reusable programs. 
For instance, for the $\mathcal{DC}$ schema, a base of specifications and programs for 
dec, and of well-founded relations $\prec$, needs to be available. 

Human Interaction. Schema-guided synthesis can be fully automated, as 
demonstrated with Cypress [65], Kids [66], DesignWare [67], and PlanWare 
[15]. However, interactive synthesis is preferable, with the human programmer 
taking the creative, high-level, heuristic design decisions, and the synthesizer 
doing the more clerical work. The design issues are intelligible to humans because 
the very objective of program schemas is to capture recognized, useful, human- 
designed programming strategies and program classes. 

Tool Support. An implementation of schema-guided synthesis can be made on 
top of any existing proof planner, exploiting the fact that program schemas can 
be seen as proof methods [35]. This provides support for the necessary higher- 
order matching and discharging of proof obligations. 

Scalability. The search space of schema-guided synthesis is much smaller than 
for deductive synthesis. First, schema-guided synthesis by definition bottoms 
out in reuse, both of the template itself and of existing programs. One can 
significantly reduce the number of reuse queries by applying heuristics detecting 
that an ad hoc program can be trivially built from the specification. Second, 
the proof obligations of Steps 3 and 4 are quite lightweight. Schema-guided 
synthesis thus scales up to real-life synthesis tasks, especially if coupled with 
a powerful program optimization workbench and sufficient domain knowledge. 
For instance, Smith [67] has successfully deployed his tools on real-life problems, 
such as transportation scheduling. 

4.3 Example 

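Let us synthesize a program from the following specification, open in sort ST: 

$$\begin{aligned}
& \forall L : \mathit{list}(ST) .\; \forall I : \mathit{nat} .\; \forall E : ST .\; \mathit{true} \rightarrow \\
& \quad (\mathit{atpos}(L, I, E) \leftrightarrow \exists P, S : \mathit{list}(ST) .\; \mathit{append}(P, E \cdot S, L) \land \mathit{length}(P, I))
\end{aligned} \tag{$S_{atpos}$}$$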

The first iteration of synthesis proceeds as follows. At Step 1, the specification 
$S_{atpos}$ is chosen because it is the only unhandled specification. At Step 2, suppose 
schema $\mathcal{DC}$ is chosen, after a failed attempt to apply schema $\mathcal{REUSE}$. At Step 3, 
the specification $S_{atpos}$ is inferred to be an instance of $S_r$, when $\mathit{atpos}(L, I, E)$ 
is seen as $r(I, L, E)$, under the substitution 

$$\begin{aligned}
(SX, SY, SZ) &= (\mathit{nat}, \mathit{list}(ST), ST) \\
i_r(X, Z) &\leftrightarrow \mathit{true} \\
o_r(X, Y, Z) &\leftrightarrow \exists P, S : \mathit{list}(ST) .\; \mathit{append}(P, Z \cdot S, Y) \land \mathit{length}(P, X)
\end{aligned}$$

So far, 5 of the 9 parameters of $\mathcal{DC}$ have been instantiated. At Step 4, suppose 
the following substitution is chosen: 

$$\begin{aligned}
SH &= \mathit{nat} & A \prec B &\leftrightarrow B = s(A) \\
i_{dec}(X, Z) &\leftrightarrow \lnot X = 0 & o_{dec}(X, Z, H, T) &\leftrightarrow X = s(T)
\end{aligned}$$

This instantiates the remaining 4 parameters of $\mathcal{DC}$ in a way that the constraints 
$C_1$, $C_2$, $C_3$ hold and that the program 
$P_{dec} = \{\mathit{dec}(X, Z, H, T) \leftarrow X = s(T)\}$ can be reused to meet the now fully 
instantiated specification $S_{dec}$. The specifications of the remaining open relations 
in template DC are now also fully instantiated: 

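$$\mathit{true} \rightarrow (\mathit{min}(X, Z) \leftrightarrow \lnot\lnot X = 0) \tag{$S_{min}$}$$

$$\mathit{true} \land \lnot\lnot X = 0 \rightarrow (\mathit{solve}(X, Y, Z) \leftrightarrow \exists P, S .\; \mathit{append}(P, Z \cdot S, Y) \land \mathit{length}(P, X)) \tag{$S_{solve}$}$$

$$\begin{aligned}
& X = s(T) \land \exists P, S .\; \mathit{append}(P, Z \cdot S, V) \land \mathit{length}(P, T) \rightarrow \\
& \quad (\mathit{comp}(H, Z, V, Y) \leftrightarrow \exists P', S' .\; \mathit{append}(P', Z \cdot S', Y) \land \mathit{length}(P', X))
\end{aligned} \tag{$S_{comp}$}$$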

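They can be simplified into the following specifications: 

$$\mathit{min}(X, Z) \leftrightarrow X = 0 \tag{$S'_{min}$}$$

$$X = 0 \rightarrow (\mathit{solve}(X, Y, Z) \leftrightarrow \exists S : \mathit{list}(ST) .\; Y = Z \cdot S) \tag{$S'_{solve}$}$$

$$\begin{aligned}
& X = s(T) \land \exists P, S .\; \mathit{append}(P, Z \cdot S, V) \land \mathit{length}(P, T) \rightarrow \\
& \quad (\mathit{comp}(H, Z, V, Y) \leftrightarrow \exists F : ST .\; Y = F \cdot V)
\end{aligned} \tag{$S'_{comp}$}$$

At Step 5, the program $DC \cup P_{dec}$ becomes the reused program for $S_{atpos}$, 
while $S'_{min}$, $S'_{solve}$, and $S'_{comp}$ are added to the now empty list of unhandled 
specifications. 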

The next iterations of synthesis proceed as follows. When $S'_{min}$, $S'_{solve}$, and 
$S'_{comp}$ are chosen, suppose applications of some suitable variants of $\mathcal{REUSE}$ 
succeed through the ad hoc building of the programs 
$P_{min} = \{\mathit{min}(X, Z) \leftarrow X = 0\}$, 
$P_{solve} = \{\mathit{solve}(X, Y, Z) \leftarrow Y = Z \cdot S\}$, and 
$P_{comp} = \{\mathit{comp}(H, Z, V, Y) \leftarrow Y = F \cdot V\}$. 
Since no new specifications were created, the synthesis is completed 
and has discovered the substitution $\phi_1$. For call-mode $\mathit{atpos}(+, -, +)$, say, the 
corresponding logic program 

$$\begin{aligned}
\mathit{atpos}(L, I, E) &\leftarrow I = 0,\; L = E \cdot S \\
\mathit{atpos}(L, I, E) &\leftarrow \lnot I = 0,\; I = s(T),\; \mathit{atpos}(V, T, E),\; L = F \cdot V
\end{aligned}$$

can be implemented [25], say by the Mercury compiler [68], into the following 
steadfast program: 

$$\begin{aligned}
\mathit{atpos}(E \cdot S, 0, E) &\leftarrow \\
\mathit{atpos}(F \cdot V, s(T), E) &\leftarrow \mathit{atpos}(V, T, E)
\end{aligned}$$

The comp operator had to be moved in front of the recursive call to achieve this. 
(Prolog cannot do this, so mode-specific implementation is left as a manual task 
to the Prolog programmer.) 

This example illustrated a relatively simple use of the $\mathcal{DC}$ schema. In [31], 
a quicksort program is synthesized, using a variant of the divide-and-conquer 
schema $\mathcal{DC}$ with two recursive calls. 

4.4 Analysis 

Schema-guided synthesis captures recognized, useful, human-designed program- 
ming strategies and program classes in program schemas. In doing so, it takes 
the hardest proof obligations offline, preventing their repeated proof across var- 
ious syntheses and making reuse of existing programs the central mechanism for 
synthesizing programs. In the presence of powerful program optimization tools 
and sufficient domain knowledge, it thus naturally scales up, without any limita- 
tions on specification forms or program forms, due to the modular nature of the 
various forms of background knowledge. Heuristic guidance issues are still best 
tackled by humans, so schema-guided synthesis is best carried out interactively. 

A unified view of schema-guided synthesis and proof planning has been pro- 
posed [35], revealing potential new aspects of program schemas, such as appli- 
cability conditions capturing heuristics, as well as the possibility of formulating 
program schemas as proof methods and thereby reusing an existing proof plan- 
ner as a homogeneous implementation platform for both the schema applications 
and the proof obligations of schema-guided synthesis. 

Our future work includes redoing the constraint abduction process for more 
general divide-and-conquer templates, where some $\mathit{nonMinimal}(X, Z)$ is not 
necessarily $\lnot \mathit{min}(X, Z)$, and crafting the corresponding strategies, in order to 
allow the synthesis of a larger class of programs. Other design methodologies 
need to be captured in logic programming schemas; for instance, a global search 
schema has been proposed for the synthesis of constraint logic programs [37]. 

5 Inductive Synthesis 

Following a brief introduction to inductive generalization, we present a particular 
approach to induction of recursive logic programs called compositional inductive 
synthesis, which is described in detail in [46]. 

5.1 Background 

The inductive approach to program synthesis originates in inductive logic. In- 
ductive logic is concerned with the construction of logical theories T explaining 
available observations or events. This means that, given evidence in the form 
of atomic formulas $a_1, a_2, \ldots, a_n$, the logical induction approach is to devise an 
appropriate logical theory T so that 

$$T \vdash a_1 \land a_2 \land \ldots \land a_n .$$

A major concern is to constrain T so as to rule out trivial solutions, such as 
T being inconsistent (thus supporting any evidence), or T being identical to the 
conjunction of available evidence. In the more traditional application of logical 
theories of induction in artificial intelligence, the quest is for a theory T taking 
the form of general rules, e.g., scientific rules, supporting the given evidence. In 
the context of induction of logic programs addressed here, the “observations” are 
intended sample program input-output results in the form of atomic formulas, 
and the theory T is to be a definite clause logic program. Thus the consistency of 
T is guaranteed, but computational properties such as termination and compu- 
tational tractability of the synthesized program have to be separately considered. 

So the goal of inductive logic programming (ILP) is to obtain a collection of 
clauses with universally quantified variables, which subsumes the given finite list 
of intended program results. The main approach to achieve this goal is syntactic 
generalization of the given examples. Consider atoms $p(a, a \cdot b \cdot \mathit{nil})$ and 
$p(b, b \cdot \mathit{nil})$. These two unit clauses generalize to the clause program 
$p(X, X \cdot Y) \leftarrow$. This rests on the existence of a dual of the most general unifier 
of two atoms known as the least general generalization (LGG) [63,62]. In this 
simple case, the LGG yields the intended program as a unit clause witness, 
$p(X, X \cdot Y) \vdash p(a, a \cdot b \cdot \mathit{nil}) \land p(b, b \cdot \mathit{nil})$. 

The syntactical generalization of terms has been extended to a notion of 
generalized subsumption of clauses [18,63] and further to a method known as 
inverse resolution, see e.g., [58]. This method has proven useful for concept 
formation, deductive databases and data mining. However, it is too weak for 
induction of recursive logic programs. Consider examples of list concatenation, 
e.g., $p(\mathit{nil}, a \cdot \mathit{nil}, a \cdot \mathit{nil})$ and $p(a \cdot \mathit{nil}, b \cdot \mathit{nil}, a \cdot b \cdot \mathit{nil})$. The least general 
generalization yields the clause $p(X, Y \cdot \mathit{nil}, a \cdot Z) \leftarrow$, which fails to capture the 
recursive definition of concatenation. Providing more examples eventually leads 
to an overly general clause: the universal predicate p{X,Y, Z), which subsumes 
all concatenation examples though it blatantly fails to capture concatenation of 
lists. A general remedy for over-generalization is to include negative examples, 
which are understood as examples in the complement set of the intended result 
set of atoms. In general, the key problem in synthesizing such programs is the 
invention and introduction of appropriate recursive forms of clauses. 
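
The two generalizations above can be reproduced with a short anti-unification routine. The following Python sketch is a minimal illustration under assumed conventions, not an implementation taken from the cited work: compound terms are tuples `(functor, arg1, ..., argn)`, constants are lowercase strings, generated variables are strings `V0`, `V1`, and so on, and the helper names `lgg` and `cons` are introduced here.

```python
def lgg(s, t, table=None):
    """Least general generalization (anti-unification) of two first-order terms."""
    if table is None:
        table = {}                       # maps a pair of differing subterms to its variable
    if s == t:
        return s
    same_shape = (isinstance(s, tuple) and isinstance(t, tuple)
                  and s[0] == t[0] and len(s) == len(t))
    if same_shape:
        # Same functor and arity: generalize argument by argument.
        return (s[0],) + tuple(lgg(a, b, table) for a, b in zip(s[1:], t[1:]))
    # Differing subterms: the same pair always receives the same variable.
    if (s, t) not in table:
        table[(s, t)] = "V%d" % len(table)
    return table[(s, t)]

def cons(head, tail):
    """Build the list term head . tail."""
    return ("cons", head, tail)

# p(a, a.b.nil) and p(b, b.nil)
e1 = ("p", "a", cons("a", cons("b", "nil")))
e2 = ("p", "b", cons("b", "nil"))

# p(nil, a.nil, a.nil) and p(a.nil, b.nil, a.b.nil)
c1 = ("p", "nil", cons("a", "nil"), cons("a", "nil"))
c2 = ("p", cons("a", "nil"), cons("b", "nil"), cons("a", cons("b", "nil")))
```

Reusing one variable per pair of differing subterms is what makes the first result share its variable between both argument positions, giving p(X, X.Y) rather than the more general p(X, Z.Y).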

Compositional inductive synthesis employs a compositional logical language 
for computing relations in analogy to functional programming languages in- 
tended for composing and computing functions. The method does not apply the 
above generalization mechanisms. A program takes the form of a variable-free 
predicate expression $\varphi$ encompassing elementary predicates and operators for 
combining relations and producing new resulting relations. 

Let $\varphi \vdash e$ mean that the tuple (of terms) $e$ is deducible from the program 
predicate expression $\varphi$. The computational semantics of the language can then 
be explained by means of inference rules of the form 

$$\frac{\varphi_1 \vdash e_1 \quad \ldots \quad \varphi_n \vdash e_n}{\mathit{op}(\varphi_1, \ldots, \varphi_n) \vdash e}$$

where $e$ depends on $\mathit{op}$ and $e_1, \ldots, e_n$, as explicated in the concrete rules below. 
Let $\varphi \vdash e_1 + \ldots + e_n$ mean $\varphi \vdash e_i$ for $i = 1..n$, so that $+$ combines result tuples. 
Thus, $\varphi \vdash e_1 + e_2 + \ldots$ expresses that the tuples $e_1, e_2, \ldots$ of the term form 
$(t_1, \ldots, t_n)$ are computable from the n-ary predicate expression $\varphi$. 

In the language Combilog employed here, the given elementary predicates 
are constant formation, identity and list construction defined by the inference 
rules: 

$$\mathit{const}_c \vdash (c) \qquad \mathit{id} \vdash (t, t) \qquad \mathit{cons} \vdash (h, t, h \cdot t)$$

In addition to the elementary predicates, there is a collection of operators, 
which map argument relations to relations. The three fundamental operators are 
here defined by: 

$$\frac{\varphi \vdash (t_1, t_2, \ldots, t_n)}{\mathit{make}_{\mu_1, \ldots, \mu_m}(\varphi) \vdash (t_{\mu_1}, \ldots, t_{\mu_m})} \;(\mathit{make})$$

$$\frac{\varphi_1 \vdash e + e' \qquad \varphi_2 \vdash e + e''}{\mathit{and}(\varphi_1, \varphi_2) \vdash e} \;(\mathit{and}) \qquad\qquad \frac{\varphi_1 \vdash e_1 \qquad \varphi_2 \vdash e_2}{\mathit{or}(\varphi_1, \varphi_2) \vdash e_1 + e_2} \;(\mathit{or})$$

The make operator is a generalized unary projection operator carrying an 
auxiliary vector of indices $\mu_1, \ldots, \mu_m$ serving to reorder arguments and introduce 
don’t cares. As described in [46], Combilog possesses a compositional semantics 
in which and is set intersection and or is set union, which motivates the inference 
rules for the and and or operators. These operators reflect, respectively, logical 
conjunctions in clause bodies and multiple defining clauses. 

This operator language becomes as expressive as ordinary clause programs 
if the language is extended with facilities for naming predicate expressions and 
using these names recursively in program predicate definitions. However, in the 
present form the language does not introduce predicate names in a program. 
Instead, the defined predicates are anonymous and in order to accommodate 
recursive formulations e.g., for list processing, the iteration operators foldr and 
foldl are introduced. These operators are akin to the fold operators in functional 
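programming and with theoretical underpinning in the theory of primitive 
recursive functions as discussed in [45,46]. The associated rules are: 

$$\frac{\psi \vdash (t_1, t_3)}{\mathit{foldr}(\varphi, \psi) \vdash (t_1, \mathit{nil}, t_3)} \;(\mathit{foldr}\ 0)$$

$$\frac{\mathit{foldr}(\varphi, \psi) \vdash (t_1, t_2, z) \qquad \varphi \vdash (h, z, t_3)}{\mathit{foldr}(\varphi, \psi) \vdash (t_1, h \cdot t_2, t_3)} \;(\mathit{foldr} > 0)$$

$$\frac{\psi \vdash (t_1, t_3)}{\mathit{foldl}(\varphi, \psi) \vdash (t_1, \mathit{nil}, t_3)} \;(\mathit{foldl}\ 0)$$

$$\frac{\varphi \vdash (h, t_1, z) \qquad \mathit{foldl}(\varphi, \psi) \vdash (z, t_2, t_3)}{\mathit{foldl}(\varphi, \psi) \vdash (t_1, h \cdot t_2, t_3)} \;(\mathit{foldl} > 0)$$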

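For instance, with foldr available, the well-known append concatenation 
predicate is $\mathit{make}_{2,1,3}(\mathit{foldr}(\mathit{cons}, \mathit{id}))$, where the make operator swaps the two 
first arguments. 

Below we illustrate the application of the rules using the append program, 
proving $\mathit{make}_{2,1,3}(\mathit{foldr}(\mathit{cons}, \mathit{id})) \vdash (a \cdot \mathit{nil}, b \cdot \mathit{nil}, a \cdot b \cdot \mathit{nil})$: 

$$\dfrac{\dfrac{\dfrac{\mathit{id} \vdash (b \cdot \mathit{nil}, b \cdot \mathit{nil})}{\mathit{foldr}(\mathit{cons}, \mathit{id}) \vdash (b \cdot \mathit{nil}, \mathit{nil}, b \cdot \mathit{nil})} \,(\mathit{foldr}\ 0) \qquad \mathit{cons} \vdash (a, b \cdot \mathit{nil}, a \cdot b \cdot \mathit{nil})}{\mathit{foldr}(\mathit{cons}, \mathit{id}) \vdash (b \cdot \mathit{nil}, a \cdot \mathit{nil}, a \cdot b \cdot \mathit{nil})} \,(\mathit{foldr} > 0)}{\mathit{make}_{2,1,3}(\mathit{foldr}(\mathit{cons}, \mathit{id})) \vdash (a \cdot \mathit{nil}, b \cdot \mathit{nil}, a \cdot b \cdot \mathit{nil})} \,(\mathit{make})$$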
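
The forward (computing) reading of these rules can be sketched in Python. The sketch below is an illustration under simplifying assumptions rather than the Combilog semantics itself: every relation is fixed to the mode in which its last argument is the output, a relation is modeled as a function returning the list of possible outputs, Python lists stand in for list terms, and `make_213` covers only the index vector 2,1,3. The names `ident` and `make_213` are introduced here.

```python
def cons(h, t):
    """cons |- (h, t, h.t), with the third argument as output."""
    return [[h] + t]

def ident(t):
    """id |- (t, t), with the second argument as output."""
    return [t]

def foldr(phi, psi):
    def rel(t1, lst):
        if not lst:                              # rule (foldr 0)
            return psi(t1)
        h, t2 = lst[0], lst[1:]                  # rule (foldr > 0)
        return [t3 for z in rel(t1, t2) for t3 in phi(h, z)]
    return rel

def foldl(phi, psi):
    def rel(t1, lst):
        if not lst:                              # rule (foldl 0)
            return psi(t1)
        h, t2 = lst[0], lst[1:]                  # rule (foldl > 0)
        return [t3 for z in phi(h, t1) for t3 in rel(z, t2)]
    return rel

def make_213(phi):
    """make with index vector 2,1,3: swap the first two arguments."""
    return lambda a, b: phi(b, a)

append = make_213(foldr(cons, ident))
reverse_onto = foldl(cons, ident)    # accumulates the list in reverse order
```

Returning a list of outputs rather than a single value keeps the sketch relational in spirit, since an argument relation may in general admit several results or none.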

When the inference rules are used to compute result tuples, these tuples are 
unknown parameters to be determined in the course of the execution. In 
contrast, in the compositional inductive synthesis method, the result tuples are 
given initially, as a contribution to the result, whereas $\varphi_1, \ldots, \varphi_n$ are (partly) 
unknown program constituents to be determined recursively in the course of the 
synthesis. These inference rules are used in the way described in Section 3.1 
for building proofs in a goal directed manner where the program constructs are 
unknowns, given as metavariables, and instantiated during proof. This 
facilitates understanding of the induction process as a stepwise, principled, program 
composition process. 

5.2 Overview 

Let us now present compositional inductive synthesis in terms of its generic 
features. 

Specifications. In inductive synthesis, specifications are partial extensional 
definitions of the programs to be induced, i.e., a set of atoms or tuples consti- 
tuting sample program results. No other problem specific specifications need be 
employed. 

Mechanism. The operators are similar to schemas in the schema guided ap- 
proach to synthesis. In the present method, the program is synthesized in a 
strict recursive divide-and-conquer process by tentatively selecting an operator 
and then recursively attempting synthesis of constituent parameter programs. 

Our synthesis takes advantage of the metainterpreter outlined below for 
compositional programs and does not rely on generalization mechanisms. The 
approach can be characterized as the top-down stepwise composition and 
specialization of a Combilog program intended as a solution in the sense that the 
program subsumes the program examples. The search involved in choosing between 
operators is taken care of by the backtracking mechanism in the synthesizer. 

In principle, our synthesis proceeds by introducing metavariables for the left 
operand predicate expressions of $\vdash$ in the proof construction, and then 
successively instantiating these variables in the course of the goal-driven proof 
construction; in doing so, we also appeal to the rule 

$$\frac{\varphi \vdash e_1 \qquad \varphi \vdash e_2}{\varphi \vdash e_1 + e_2}$$

which is used for goal splitting on the program examples. Thus the above proof 
may be conceived of as a trace of a sample inductive synthesis proof. 

In our metainterpreter system, the relationship $\varphi \vdash e$ is realized as a binary 
predicate syn, which simultaneously serves as metainterpreter and synthesizer. 
The key principle of our synthesis method is the inverted use of our 
metainterpreter so that the first argument program predicate is to be instantiated in the 
course of synthesizing a program. 

Thus the heart of the synthesizer is clauses of the following, general, divide- 
and-conquer form for the available operators: 

syn{comb{Pi, . . . , Pm), Ex) ^ apply -Comb {Ex , Ex \, . . . , Exm) 

A syn{Pi,Exi) A . . . A syn{Pm,Exra)- 

Programs consisting of an elementary predicate are trivially synthesized without 
recursive invocation of syn. Let us consider the synthesis of a basic predicate 
expression for the head predicate yielding the head of a non-empty list, given 
say the two examples (a · b · nil, a) and (a · nil, a). Synthesis of head is initiated
with a goal clause

← syn(P, [[a, b], a]) ∧ syn(P, [[a], a]).

A successful proof instantiates P with the synthesized expression $make_{3,1}(cons)$.
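
The inverted use of the metainterpreter can be imitated on a very small scale. The following Python sketch is not the syn predicate itself: it assumes a toy hypothesis space consisting only of the expressions make_{i,j}(cons) with i, j in {1, 2, 3}, represents lists as Python lists, and reads cons(X1, X2, X3) as X3 = X1 · X2. The names holds_make_cons and consistent_candidates are hypothetical.

```python
from itertools import product

def holds_make_cons(idx, args):
    """Check whether make_idx(cons) holds for the ground tuple args.

    make_idx(cons)(Y1, ..., Yn) holds if there are X1, X2, X3 with
    X3 = X1 . X2 (cons) and Yk = X_idx[k] for every k.
    """
    binding = {}
    for i, value in zip(idx, args):
        if i in binding and binding[i] != value:
            return False          # the same X_i would need two different values
        binding[i] = value
    if 3 in binding:
        x3 = binding[3]
        if not isinstance(x3, list) or not x3:
            return False          # X3 must be a non-empty list
        if 1 in binding and binding[1] != x3[0]:
            return False          # X1 must be the head of X3
        if 2 in binding and binding[2] != x3[1:]:
            return False          # X2 must be the tail of X3
        return True
    # X3 is unconstrained: any X1 works, X2 only has to be a list.
    return 2 not in binding or isinstance(binding[2], list)

def consistent_candidates(examples):
    """Enumerate make_{i,j}(cons) and keep those that cover every example."""
    return [idx for idx in product((1, 2, 3), repeat=2)
            if all(holds_make_cons(idx, ex) for ex in examples)]

# The two examples for head: (a . b . nil, a) and (a . nil, a).
examples = [(["a", "b"], "a"), (["a"], "a")]
candidates = consistent_candidates(examples)
```

In this toy space two candidates cover both examples: (3, 1), which is the head expression given in the text, and (2, 1), which relates any list to any element. Coverage of the examples alone therefore does not single out the intended program, which is why further constraints on the search are needed.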

Heuristics. A detailed description of the synthesizer is found in [46]. To pre- 
vent the synthesizer from running astray in the infinite space of possible pro- 
gram hypotheses, the search is conducted as an iterative deepening. To avoid 
unwanted trivial program solutions, further constraints are imposed on the syn- 
thesizer. Consider, for instance, synthesis of the append predicate. An overly 
general solution is obtained as the universal predicate, say, with the expression
$make_{2,3,4}(const_c)$ corresponding to the clause $p(X_1, X_2, X_3)$. As mentioned, such
unwanted solutions might be ruled out by the use of negative examples. However,
in our synthesizer we have chosen to enforce well-modedness constraints
on the synthesized programs, thus suppressing the above solution in favor of the
recursive

P = $make_{2,1,3}(foldr(cons, id))$,

which is obtained as the syntactically smallest solution given the two sample
results (nil, nil, nil) and (a · nil, b · nil, a · b · nil) and the mode pattern [+, +, −],
and complying with the usual clauses for append. The synthesis proceeds as a 
goal-driven proof construction of the sample proof shown in the above section. 
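
How a mode constraint can reject an overly general candidate may be sketched as follows. This is a simplified reading of well-modedness and an assumption, not the definition used in [46]: each elementary predicate is given hypothetical mode declarations stating which argument positions become determined once others are known, and make_idx(base) counts as well-moded if every output position is determined by the input positions. The table MODES and the function well_moded are illustrative names.

```python
# Assumed mode declarations: (positions that must be known, positions then determined).
MODES = {
    "cons": [({1, 2}, {3}), ({3}, {1, 2})],   # build a list, or split one
    "const_c": [(set(), {1})],                 # the constant is always known
}

def well_moded(base, idx, pattern):
    """Is make_idx(base) well-moded for a mode pattern such as '++-'?"""
    # Argument positions of the base predicate fed by input ('+') arguments.
    known = {i for i, m in zip(idx, pattern) if m == "+"}
    changed = True
    while changed:                  # propagate what the base predicate determines
        changed = False
        for needs, gives in MODES[base]:
            if needs <= known and not gives <= known:
                known |= gives
                changed = True
    # Every output ('-') argument must refer to a determined position.
    return all(i in known for i, m in zip(idx, pattern) if m == "-")
```

Under these assumptions, the universal candidate make_{2,3,4}(const_c) fails the pattern [+, +, −] because its output refers to a position that nothing determines, whereas make_{3,1}(cons) passes the pattern [+, −] because splitting a known list determines its head.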

Background Knowledge. The elementary predicates and the operators de- 
termine the admissible forms of programs and thereby constitute a form of back- 
ground knowledge. No problem-specific background knowledge is provided but 
a search bias may be imposed by providing additional auxiliary predicates. 

Tool Support. For synthesizing the atpos program, a system called CombInduce
was used, which is based on the method outlined above and described
in detail in [46].

Human Interaction and Scalability. The current experimental system 
conducts the inductive synthesis automatically. The computational search costs 
limit the size of inducible programs to around 6 predicates and operators. 

However, we envisage integration of the CombInduce principles into a semi-automatic
compositional development system. In this system, the programmer
can offer assistance by proposing appropriate auxiliary predicates within the 
pertinent data type. The imposition of data types will also serve to constrain 
further the search space of well-moded program candidates. Recursion (fold) 
over lists will be generalized to other data types later. 

5.3 Example 

Since, at this stage, the synthesis system supports lists as the only data type, we
represent the number n as a list of length n with constants i, where i can be any
constant. Synthesis of the atpos program from the single sample (a · b · nil, i · nil, b)
yields the solution

atpos = $foldl(make_{4,3,2}(cons), make_{3,1}(cons))$

as illustrated by the following trace: 

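\[
\dfrac{
  make_{4,3,2}(cons) \vdash (\_, a \cdot b \cdot nil, b \cdot nil)
  \qquad
  \dfrac{make_{3,1}(cons) \vdash (b \cdot nil, b)}
        {foldl(make_{4,3,2}(cons), make_{3,1}(cons)) \vdash (b \cdot nil, nil, b)}\ (foldl\ 0)
}{
  foldl(make_{4,3,2}(cons), make_{3,1}(cons)) \vdash (a \cdot b \cdot nil, i \cdot nil, b)
}\ (foldl > 0)
\]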

The synthesized program is the Combilog form of the definite clause program 

atpos(L, I, E) ← syn(foldl(tail', head), [L, I, E])
syn(tail', [_, F · T, T]) ←
syn(head, [F · T, F]) ←
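
A functional reading of this program, for the mode in which the list and the index are given and the element is computed, may be sketched in Python. The sketch assumes that foldl(P, Q) consumes its second argument from the left, applies P once per element to an accumulator that starts as the first argument, and applies Q when the second argument is exhausted; failure of the relation is modelled by raising ValueError. The Python names are illustrative.

```python
def foldl(p, q):
    """Functional reading of foldl(P, Q) under the mode [+, +, -]."""
    def run(acc, xs):
        for x in xs:          # step case: P relates x and the accumulator to a new accumulator
            acc = p(x, acc)
        return q(acc)          # base case: Q maps the final accumulator to the result
    return run

def tail_prime(_x, lst):
    """tail'(_, F . T, T): drop the first element, ignoring the first argument."""
    if not lst:
        raise ValueError("tail' is undefined on nil")
    return lst[1:]

def head(lst):
    """head(F . T, F): first element of a non-empty list."""
    if not lst:
        raise ValueError("head is undefined on nil")
    return lst[0]

# atpos = foldl(tail', head); the index n is a list of n arbitrary constants.
atpos = foldl(tail_prime, head)
```

With this reading, atpos(["a", "b"], ["i"]) reproduces the sample used for synthesis: one tail step is taken for the single constant in the index, and head is then applied to the remaining list.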

Synthesis with the foldr operator is not possible. However, swapping the 
two subgoals of foldr yields the operator foldrrev allowing the following variant 
program to be synthesized 

atpos = $make_{3,2,1}(foldrrev(cons, make_{1,3}(cons)))$.

The relationship between such a pair of variant programs is theoretically 
established by a duality theorem stated and proved in [44]. 

In order to facilitate the comparison of the synthesis approaches, let us trans- 
form the first Combilog form of the atpos definite clause program into a recur- 
sive atpos program. We first unfold the atpos clause: 

atpos(L, nil, E) ← head(L, E)

atpos(L, X · T, E) ← tail'(X, L, Z), syn(foldl(tail', head), [Z, T, E])

Now, unfolding head and tail, and folding back the second literal with atpos, we
obtain the following logic program. 

atpos(L, nil, E) ← L = E · T
atpos(L, X · T, E) ← L = F · Z, atpos(Z, T, E)
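
The recursive program can be transcribed as a relational check, which makes it easy to compare with the programs produced by the other methods. The Python sketch below is an illustration under the same list encoding of numbers; atpos_holds is a hypothetical name, and the function decides whether a ground triple is in the relation rather than computing an answer.

```python
def atpos_holds(lst, index, elem):
    """Decide atpos(L, I, E) for ground arguments, following the two clauses."""
    if not lst:
        return False                  # both clauses require L to be non-empty
    if not index:
        return lst[0] == elem         # base clause: I is nil and E is the head of L
    # Recursive clause: strip one element from L and one constant from I.
    return atpos_holds(lst[1:], index[1:], elem)
```

The element stripped from L in the recursive clause is unrelated to E, so the check succeeds for (a · b · nil, i · nil, b) and fails for (a · b · nil, i · nil, a).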



5.4 Analysis 

Designing a metainterpreter for Combilog
is simplified by the variable-free form of Combilog programs, the separation 
of predicate expressions and terms in separate arguments, and the elimination 
of introduced predicate names. These simplifications substantially reduce search 
and allow us to effectively use the metainterpreter as the backbone of our ILP 
method by reversing the provability metalogic programming demo predicate as 
examined e.g., in [43] and in [21] for ordinary definite clauses. 

In [46] we compare with other inductive synthesis systems and report results 
on successful automatic synthesis of a number of textbook programs including 
non-naive as well as naive reversal of lists. The latter program makes calls for 
the auxiliary predicate append, which is recursively induced. This predicate in- 
vention, which is generally considered problematic in ILP, is handled smoothly 
in our compositional method since explicit predicate names are not introduced. 

The outlined compositional method facilitates a program development 
methodology where customized domain specific operators are added to the gen- 
eral purpose ones. Moreover, it seems that the compositional method surpasses 
more traditional ILP methods with respect to predicate invention and termi- 
nation of induced programs within the considered class of primitive recursive 
relations delineated by the available recursive operators. 

6 Comparison 

In this section, the synthesis approaches are compared from different points of 
view. First, we compare the synthesized atpos programs. Afterwards, we con- 
trast the general features of the different approaches. Finally, we conclude by 
analyzing how schemas are used, implicitly or explicitly, in program synthesis 
and we suggest that they play a central role in understanding different synthesis 
methods. In the following, we will refer to inductive synthesis, deductive syn- 
thesis, and schema-guided synthesis to denote the particular synthesis methods 
presented in this paper. 



6.1 The atpos(L,I,E) Program 

All three methods yielded the same program. This was the case even though they 
differ in which variable they choose as an induction parameter: both inductive 
synthesis and schema-guided synthesis choose I as the induction parameter, while
deductive synthesis chooses L. In the case of deductive synthesis, we could just
as well have carried out induction on I. However, for schema-guided synthesis,
switching would require a separate schema with a different template, namely with 
an additional non-recursive clause for the non-minimal case. The same holds for 
inductive synthesis where a fold combinator over numbers and an associated rule 
would be required. 

In general, the choice of the induction parameter will affect the form of 
the resulting program and even its complexity [25]. In this regard, deductive 
synthesis offers more flexibility, as one can perform induction over any well- 
founded relation, and development (hence program construction) proceeds in 
smaller steps. Of course, in schema-guided synthesis and inductive synthesis, one 
can always introduce new schemas, respectively operators, corresponding to new 
ways of building programs, as the need arises. 

6.2 Specification 

The forms of the specifications in deductive synthesis and schema-guided syn- 
thesis are similar. Both are first-order formulas asserting a possibly conditional 
equivalence. In inductive synthesis, the specification is a finite set of examples 
(a subset of the extensional definition of the relation), which is by nature incomplete
(when the extensional definition is infinite). Specifications in inductive
synthesis may also include negative examples or properties [28,36], but in general 
they remain incomplete. This incompleteness is a significant difference and, as 
we will see, it has far-reaching consequences. Indeed, it will play a key role in 
differentiating inductive synthesis from the other two approaches with respect 
to the other generic features. 
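
The consequence of incompleteness can be made concrete with a small Python sketch. Both functions below are hypothetical candidates written for this illustration; entails checks only that a candidate reproduces the given examples, which is all an example-based specification can demand.

```python
def atpos(lst, index):
    """Intended relation: the element at the position encoded by len(index)."""
    return lst[len(index)]

def last(lst, index):
    """An unintended candidate: always return the last element."""
    return lst[-1]

def entails(program, examples):
    """A candidate is acceptable if it reproduces every example."""
    return all(program(l, i) == e for l, i, e in examples)

# The single sample (a . b . nil, i . nil, b) used for atpos.
examples = [(["a", "b"], ["i"], "b")]
```

Both candidates are accepted by the single sample, yet they disagree on the input (a · b · c · nil, i · nil). Choosing between them is exactly the kind of decision that the examples cannot settle and that must be made by other means, such as a preference for syntactically small programs or additional examples.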

For the deductive synthesis and schema-guided synthesis approaches, in con- 
trast to inductive synthesis, it is important for non-trivial applications to be 
able to construct complex specifications and this requires ways of parameteriz- 
ing and combining specifications. In our work on deductive synthesis, we achieve 
this, in practice, by using logical frameworks like Isabelle [59], which provide 
support for structured theory presentations. In schema-guided synthesis, [33] 
express program schemas as extensions of specification frameworks [52], which 
support parameterized specifications and their composition. 

Of course, the use of first-order logic as a specification language has its limita- 
tions. For example, in schema-guided synthesis, we needed the well-foundedness 
of a relation ≺ as a constraint in the $\mathcal{DC}$ schema. However, a formalization of
well-foundedness generally falls outside of first-order logic, unless one formal- 
izes, e.g., set-theory. A work-around is to assume that some fixed collection of 
relations is declared to be well-founded. The alternative is to use a stronger 
(higher-order) logic or theory [1] where concepts such as well-foundedness can 
be defined and well-founded relations can be constructed. Stronger logics, of 
course, have their own drawbacks; in particular it is more difficult to automate 
deduction. 

6.3 Mechanism

As presented, the mechanisms used in the three methods appear quite dissimi- 
lar. Deductive synthesis is oriented around derivations, schema-guided synthesis 
was described using an algorithm for applying schemas, and inductive synthe- 
sis uses a meta-interpreter to build programs. Yet it is possible to recast all 
three so that the central mechanism is the same: a top-down application of
rules is used to incrementally construct a program, during a derivation, in a
correctness-preserving way. In deductive synthesis, derived rules are applied top- 
down, using higher-order unification to build programs as a “side-effect” of proof 
construction. Although the mechanism for applying schemas has been presented 
in an algorithmic fashion, it is possible to recast schema-guided synthesis as the 
application of rules in a deductive system [1]; namely, a schema constitutes a 
(derivable) rule whose premises are given by the schema’s constraints and (the 
completion of its) template and the conclusion is given by the schema’s speci- 
fications. Viewed in this way, schema-guided synthesis, like deductive synthesis, 
constructs programs, during proofs, by the higher-order application of rules. The 
main distinction between the two methods boils down to the rules, granularity 
of steps, and heuristics/interaction for constructing proofs. Finally, in inductive 
synthesis, rules are also given for constructing Combilog programs. There, the
rules are automatically applied by a Prolog meta-interpreter. 

Although they differ in form, the rules employed by the different methods 
have a similar nature. Not surprisingly, in all cases, mathematical induction 
plays a key role in program synthesis, as it is necessary for constructing itera- 
tive or recursive programs. In deductive synthesis, induction principles can be 
derived from induction principles for data types or even the inductive (least-fixedpoint)
semantics of logic programs [1]. The induction principles (perhaps in
a reformulated form, e.g., the ind rule of Section 3.2) are then explicitly applied 
and their application constructs a template for a recursive program. In schema- 
guided synthesis, the correctness of schemas for synthesizing recursive programs 
is also justified by inductive arguments. Indeed, complex schemas can be seen as 
kinds of complex macro-development steps that precompile many micro steps, 
including induction. One might say that induction is implicitly applied when us- 
ing a schema to construct recursive programs. In inductive synthesis, programs 
are iterative, instead of recursive, and programs that iterate over lists (or, more 
generally, other inductively defined data types) are built using fold rules. Again, 
mathematical induction principles play a role, behind-the-scenes, in justifying 
the correctness of iteration rules, and rule application can be seen as an implicit 
use of induction. There is, of course, a tradeoff. By compiling induction into spe- 
cialized rules, schema-guided synthesis and inductive synthesis can take larger 
steps than deductive synthesis; however, they are more specialized. In particular,
by building only iterative programs, the inductive synthesis method presented 
can sharply reduce the search space, but at the price of limited expressibility. 

The underlying mechanisms are, in some respects, fundamentally different. 
Although all three methods are based on first-order logic, any system imple- 
menting deductive synthesis (respectively schema-guided synthesis) will require 
higher-order unification (respectively higher-order matching). This is necessary
to construct substitution instances for variables in rules and schemas that range 
over functions, relations, and more generally, contexts (terms with holes); the 
downside is that higher-order matching and unification are more difficult than 
their first-order counterparts, and the existence of multiple unifiers (respectively 
matchers) can lead to large branching points in the synthesis search space. The 
operator form of Combilog means that rules in inductive synthesis manipulate 
only first-order terms. Moreover, all complications concerning object language 
variables are eliminated. This simplifies the metainterpreter and reduces the 
synthesis to search in the space of operator combinations subjected to well- 
modedness constraints. 
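
For reference, first-order unification is simple enough to state in a few lines. The Python sketch below is a textbook formulation with an occurs check, not code from any of the systems discussed; it assumes that compound terms are tuples of the form (functor, arg1, ..., argn), that constants are lower-case strings, and that variables are strings beginning with an upper-case letter. The functor names used in the usage note are made up for illustration.

```python
def is_var(t):
    """Variables are strings starting with an upper-case letter (Prolog style)."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Does variable v occur in term t under the current bindings?"""
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])

def unify(s, t, subst=None):
    """Return a most general unifier extending subst, or None on failure."""
    subst = {} if subst is None else subst
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return None if occurs(s, t, subst) else {**subst, s: t}
    if is_var(t):
        return unify(t, s, subst)
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)   # thread the bindings through the arguments
            if subst is None:
                return None
        return subst
    return None                          # clash between distinct constants or functors
```

Because a variable-free operator expression is an ordinary first-order term, instantiating the operands of an operator is a plain unification problem: unifying ("foldl", "P", "Q") with ("foldl", ("make_4_3_2", "cons"), ("make_3_1", "cons")) binds P and Q to the two operand expressions, and a most general unifier is unique up to renaming, so no branching over alternative unifiers arises.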

Finally, the differing nature of the specifications, in particular, complete ver- 
sus incomplete information, makes a substantial difference in the underlying se- 
mantics of the different methods and the relationship of the synthesized program 
to its specification. As presented here, both deductive synthesis and schema- 
guided synthesis construct programs that are (possibly under conditions) equiv- 
alent to some initial specification. In the case of inductive synthesis, equivalence 
is weakened to implication or entailment. This changes, of course, the semantics 
of the rules. Moreover it has a significant impact on extra-logical considera- 
tions, i.e., considerations that are not formalized in the synthesis logic (e.g., 
the program synthesized should have a particular syntactic form or complexity).
In inductive synthesis these considerations (in particular, having a syntactically 
small recursive program that entails the examples) become central to the syn- 
thesis process and it is important to use a well-specified strategy, embodied in a 
metainterpreter, to ensure them. 

6.4 Heuristics 

Each of the methods presented has an infinite search space. However, the spaces 
are differently structured and different heuristics may be employed in searching 
them. 

In deductive synthesis, one proceeds in a top-down fashion, employing in- 
duction and simplification. The search space has both infinite branching points 
associated with the application of higher-order unification (as there may be in- 
finitely many unifiers) and branches of unbounded length (as induction may 
be applied infinitely often and simplification may not necessarily terminate). 
In practice, an effective heuristic is to follow an induction step by eager sim- 
plification; here, rippling can be used to control the simplification process and 
guarantee its termination. Moreover, with the exception of applying induction, 
unification problems are usually of a restricted form, involving “second-order 
patterns,” which can be easily solved [51]. Hence, it is possible, in some cases, 
to use heuristics to reduce the search space to the point where synthesis can be 
completely automated. 

Schema-guided synthesis uses a strict recursive divide-and-conquer strategy 
in the selection of operators and the synthesis of the parameter programs. It also 
employs a stepwise composition/specialization of programs where the objective is 
to reuse existing code. Analogous to deductive synthesis, critical branch-points
include schema selection and selection of a substitution (higher-order match- 
ing is required as the same schema can be used in different ways). Search can 
be conducted as an iterative deepening search employing heuristics. Although 
schema-guided synthesis also has an infinite search space, it is fair to say that 
when a program is in the search space, one is likely to find it more quickly than 
with deductive synthesis since the steps in schema-guided synthesis are larger, 
and hence the program is at a shallower ply in the search tree. 
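
The effect of step size on an iterative deepening search can be illustrated with a generic Python sketch. It is not tied to any of the three synthesis systems: the search space is an abstract tree given by a hypothetical expand function, and the toy problem (reaching a total of 6 by adding steps) merely stands in for building a program out of smaller or larger development steps.

```python
def depth_limited(node, is_goal, expand, limit):
    """Depth-first search that never descends more than `limit` steps."""
    if is_goal(node):
        return node
    if limit == 0:
        return None
    for child in expand(node):
        found = depth_limited(child, is_goal, expand, limit - 1)
        if found is not None:
            return found
    return None

def iterative_deepening(start, is_goal, expand, max_depth):
    """Retry depth-limited search with growing bounds, so shallow solutions are found first."""
    for limit in range(max_depth + 1):
        found = depth_limited(start, is_goal, expand, limit)
        if found is not None:
            return found
    return None

def make_expand(step_sizes):
    """Successors of a path: the path extended by one of the allowed steps."""
    def expand(path):
        return [path + (s,) for s in step_sizes]
    return expand

def reaches_six(path):
    return sum(path) == 6

# Only small steps available versus small and large steps available.
small = iterative_deepening((), reaches_six, make_expand((1,)), 10)
mixed = iterative_deepening((), reaches_six, make_expand((1, 3)), 10)
```

With only unit steps the goal lies at depth 6, whereas allowing a larger step places a solution at depth 2, so it is found after far fewer deepening rounds. The price is a higher branching factor at every node, which is the trade-off between step size and search width discussed above.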

The search space in inductive synthesis is more difficult to navigate than 
in the other two methods because of the additional extra-logical concerns men- 
tioned previously. Here a strict control (dictated by a metainterpreter) is required 
to generate candidate programs in a particular order. To make automated search 
practical, the search space is restricted, a priori, by restrictions in the method. 
For example, the programs synthesizable are restricted to those involving itera- 
tion, instead of general recursion, and the use of combinators ensures that first- 
order (Prolog) unification suffices for program construction. In addition there is 
the well-modedness requirement and, to reduce explosive branching, the use of 
the or operator is restricted. It is an interesting question as to whether any of these pruning
measures could be profitably used in the other approaches. 

6.5 Background Knowledge 

The three approaches formalize background knowledge in different ways. For de- 
ductive synthesis, background knowledge about data types is given by a standard 
first-order theory augmented with appropriately reformulated (for synthesis) in- 
duction schemas (e.g., ind). For schema-guided synthesis, background knowledge 
must be formalized in terms of a base of program schemas, capturing a range 
of program classes, which may (or may not) directly incorporate information 
about data types, as well as a database of reusable programs and information 
about well-founded relations (typically associated with data types). Here, more 
work is usually required to formalize background knowledge, but the payoff is 
that this work is done once and for all and the resulting schemas can be used 
to reduce search and guide development to specialized classes of programs. For 
inductive synthesis, the background knowledge is basically the elementary operators
(const, id, cons, etc.), which encode knowledge about iterative programs
operating over lists. As with the other approaches, this knowledge is domain- 
dependent, and synthesizing programs operating over other data types would 
require additional rules. 

6.6 Human Interaction and Scalability 

The deductive synthesis proof presented was constructed interactively. There, 
within a first-order formalization of list theory, specialized rules for synthesis 
were derived, and interactively applied. However, proof search can also be auto- 
mated using tactics and one can adjust the size of proof steps by deriving new 
proof rules (analogous to complex program schemas). This process of writing 
tactics and deriving new rules is open, leads to a customizable approach, and
can, at least in theory, scale arbitrarily. The use of tactics also makes it possible 
to arbitrarily mix automation with human interaction. 

Conversely, the schema-guided synthesis method was presented as fully au- 
tomatable, although a human could be used to drive the selection of schemas 
and substitution instances. Indeed, as with deductive synthesis, this is often 
preferable, as it provides a way of influencing extra-logical concerns, such as the 
complexity of the synthesized program. The approach scales well as specialized 
schemas can be tuned to particular classes of problems (divide and conquer, 
global search, etc.). Moreover, there is a natural mechanism for the reuse of 
programs. 

For the moment, there is no human interaction in the presented method for 
inductive synthesis. It is not clear either how feasible this is, given the impor- 
tance that extra-logical concerns play in the synthesis process. How would a 
human know, for example, that steps suggested will generate the simplest pos- 
sible program? The reuse of existing programs also is not handled. 

It is not clear how the inductive synthesis approach can be scaled up to 
synthesize more complex programs with recursion or iteration. For complex ex- 
amples, the incomplete nature of the input specification makes the program 
space so intractable that human interaction, heuristics, support for reuse, and 
“more complete” specification information, such as properties [30,28], appear 
necessary. But even with these extensions, the purely inductive approach to the 
synthesis of programs with recursion or iteration remains very hard, and it seems 
doubtful whether this approach will ever scale up to the synthesis of complex, 
real-life programs. 

When the synthesized program does not feature recursion or iteration (and 
methods for this are outside the scope of this paper) then the inductive synthesis 
approach can usefully scale. This is witnessed by recent progress in ILP, on 
problems in domains, such as face recognition [54], where only (large) sets of 
input/output examples are available as humans have difficulty writing a formal, 
complete specification [34]. 

6.7 Tool Support 

For deductive synthesis, we used Isabelle [59], a generic logical framework, for 
our implementation. For schema-guided synthesis, the higher-order proof plan- 
ning system XClam can be used, upon reformulation of the program schemas as 
proof planning methods [35]; this has the nice side-effect that the proof obliga- 
tions of schema-guided synthesis can also be discharged using the same theorem 
proving machinery. For inductive synthesis, a specialized Prolog implementation 
was used. 

It is interesting to speculate on whether generic logical frameworks, like Is- 
abelle, could be effectively used for all three approaches. And could the ap- 
proaches even be profitably combined? 

Our discussion at the top of Section 6.3 suggests that a generic logical frame- 
work can effectively be used for schema-guided synthesis. Of course, there are 
some potential drawbacks. First, a logical framework requires recasting any syn-
thesis method as one based on theorem proving; for instance, schema-guided 
synthesis was not cast this way in Section 4. This may require some contortions; 
see [9] for an example of this. Second, the logical framework will impose its own 
discipline for presenting and structuring theories, and this may deviate from 
that desired by a particular synthesis method; e.g., specification frameworks 
[52] provide more structuring possibilities than those possible using the Isabelle 
system. Finally, a hand-coded synthesis system will probably be more efficient. 
Although it is easy to write a Prolog interpreter (to realize inductive synthesis) 
as a tactic in a logical framework, this involves a layer of metainterpretation and 
a corresponding slow-down in execution time. The price may be too high when 
substantial search is involved. 

As to the question whether the approaches could be profitably combined, the 
answer is a clear ‘yes’ for deductive synthesis and schema-guided synthesis, and 
we will develop this point in the next sub-section. Combining inductive synthesis 
with the other approaches raises the question of how to deal with the ensuing 
redundancy in the overall specification, as the incomplete part supposedly is a 
logical consequence of the complete one. To a human programmer, examples 
attached to a specification that is intended to be complete often facilitate the 
understanding of the task. But an automated synthesizer probably does not need 
such help. Should there be a contradiction between the complete specification 
and the examples, then the overall specification is almost certainly wrong. In 
the absence of such a contradiction, one knows nothing about the quality of the 
overall specification and thus has to forge ahead. The question then arises of 
how to exploit the redundancy. A convincing proposal was made by Minton [57]: 
to cope with the instance sensitivity of the heuristics used to efficiently solve 
ubiquitous, NP-hard, constraint satisfaction problems, industry-strength solver 
synthesizers should use training instances (i.e., the input parts of examples) in 
addition to the specification of the problem, so that the most suitable heuristics 
can be empirically determined during synthesis. As long as the actual runs of 
the synthesized program are on instances within the distribution of the training
instances, a good performance can be guaranteed. 

6.8 Implicit versus Explicit Use of Schema 

A central part of our comparison has been that the boundaries between deductive 
synthesis, schema-guided synthesis, and inductive synthesis are somewhat fluid 
with respect to the use of schemas. In particular, from the appropriate view- 
point, the difference between deductive synthesis and schema-guided synthesis is 
vanishingly small. We would like to close the comparison by driving these points 
home. 

The derived rules in deductive synthesis for reasoning about equivalences are 
rule schemas, i.e., rules with metavariables ranging over predicates. These are 
metavariables from the view of a metalogic, but they also can be viewed as 
uninterpreted relations in the object logic and play the same role as the open 
relation symbols in schema-guided synthesis. Viewed this way, if the background 
theory of deductive synthesis is formalized as a specification framework, then
the inference rules are a variation of the program schemas in schema-guided 
synthesis. 

For example, the ind rule with its assumptions $A_1$-$A_3$ presented here in deductive
synthesis is similar (although not equivalent) to the $\mathcal{DC}$ schema developed
in schema-guided synthesis. In particular:

- ind commits to an induction parameter of type list, whereas $\mathcal{DC}$ has an open
sort SX for the induction parameter;

- ind commits to one-step, head-tail decomposition of the induction parameter,
whereas $\mathcal{DC}$ has an open relation dec for this;

- $\mathcal{DC}$ commits to always one recursive call in the step case, whereas ind is
flexible (there can be any number of recursive calls);

- the assumption $A_1$ of ind plays the same role as the template DC in $\mathcal{DC}$,
but they differ in content;

- the predicate variable B of ind plays the same role as the open relation solve
in $\mathcal{DC}$;

- the assumption $A_2$ of ind plays the same role as the specification $S_{solve}$ in
$\mathcal{DC}$;

- the predicate variable S of ind does not play the same role as the open
relation comp in $\mathcal{DC}$; indeed, an instance of S may include recursive call(s),
whereas recursion is dictated by the template DC and is thus not considered
when instantiating comp;

- the assumption $A_3$ of ind plays the same role as the specification $S_{comp}$ in
$\mathcal{DC}$, but they differ in content;

- there is no explicit equivalent of the constraints $C_1$, $C_2$, and $C_3$ and the
specifications $S_{min}$ and $S_{dec}$ of $\mathcal{DC}$ in ind.

The differences here are not due to the underlying synthesis mechanism, but 
are an artifact of the particular implicit schema used (for reasons of simplicity) 
in this presentation of deductive synthesis. More elaborate rules and schemas, 
neither committed to a particular type nor a well-founded relation, have been 
developed in deductive synthesis, as presented in, e.g., [1,3]. 
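
The idea of a template with open relations can be sketched as a higher-order function. The Python code below is a functional simplification written for illustration and is an assumption about the general shape of a divide-and-conquer template with a single recursive call; it does not reproduce the schema of Section 4, and in particular it omits the constraints and specifications that give a schema its semantic content. The parameter names minimal, solve, dec and comp stand for the open relations.

```python
def divide_and_conquer(minimal, solve, dec, comp):
    """Template with four open parameters and exactly one recursive call."""
    def r(x):
        if minimal(x):
            return solve(x)          # minimal case: solve directly
        hx, tx = dec(x)              # decompose the induction parameter
        return comp(hx, r(tx))       # one recursive call, then compose
    return r

# Instance 1: head-tail decomposition of a list (the choice built into ind).
length = divide_and_conquer(
    minimal=lambda l: not l,
    solve=lambda l: 0,
    dec=lambda l: (l[0], l[1:]),
    comp=lambda h, n: n + 1,
)

# Instance 2: a different sort and decomposition, with the same template.
factorial = divide_and_conquer(
    minimal=lambda n: n == 0,
    solve=lambda n: 1,
    dec=lambda n: (n, n - 1),
    comp=lambda n, f: n * f,
)
```

Leaving dec open is what allows the second instance to recurse over natural numbers instead of lists, whereas a rule that fixes head-tail decomposition would cover only the first. Termination of both instances relies on dec producing a smaller argument with respect to a well-founded relation, which this sketch does not check.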

A similar comparison can be made between the foldr and foldl operators in 
inductive synthesis, and the $\mathcal{DC}$ schema in schema-guided synthesis. The foldr
and foldl operators can also be seen as implicit program schemas. More elaborate 
rules could also be used to build Combilog programs in larger steps. 

Program schemas are thus used (implicitly or explicitly) in the different 
synthesis approaches. In the literature, program schemas are often reduced to 
templates, formalized as higher-order expressions, and applied using higher- 
order unification. As shown in schema-guided synthesis, such templates must 
be enhanced with semantic information, expressed for instance through axioms, 
constraints, and specifications. Viewing such schemas as derivation rules, and 
schema application as logical inference, the distinction vanishes between the 
schema-guided and deductive/constructive approaches. For instance, in [1] it 
is shown how schemas for transformational development can be formalized as 
derived rules and combined with other kinds of verification and synthesis. In 
[30,28], a $\mathcal{DC}$-like schema is used in the context of inductive synthesis.

7 Conclusion

In this paper, we have analyzed and compared representative methods of three 
approaches to program synthesis in computational logic. Despite their differ- 
ences, we established strong similarities. In particular, program schemas are used 
(implicitly or explicitly) in each of the methods and are central in driving the 
synthesis process and exploiting synergies. We would therefore like to conclude 
by discussing some limitations of schemas and open issues. 

Despite their central role, schemas have their limitations. Schemas are usu- 
ally expressed in some logical language, but any given language has syntactical 
restrictions that in turn restrict what can be expressed as a schema. For example, 
a first-order language fixes the arity of predicates and functions, their associated 
types, etc. There is no way to capture certain simple kinds of generalization or 
extra-logical annotations, for example to employ term or atom ellipses $t_1, \ldots, t_n$
of variable length n. As an example of this limitation, consider the ind rule of 
Section 3.2. There we used X to denote a sequence of zero or more variables 
and hence the induction rule given cannot be captured by a single schema, but 
rather requires a family of schemas, one for each n. Extensions here are possible; 
[64,28,70,39,20] provide notions of schema patterns that describe such families 
and can be specialized as needed before, or during, synthesis. 

Schemas are here defined as abstractions of classes of programs. At the same 
time, they formalize particular design strategies, such as divide-and-conquer or
global search; part of the associated strategy can also be specified by associated 
tactics, which choose induction parameters, find appropriate well-founded rela- 
tions, and so on. However, in their present form, schemas cannot handle more 
sophisticated design strategies, namely strategies abstracting a class of programs 
that cannot be obtained by instantiation with formulae. Typical examples are 
so-called design patterns [38], which aim at the description of software design 
solutions and architectures (typically described by UML diagrams and text). 
How to extend schemas to handle such strategies is an open problem in program 
synthesis. 

Overall, by examining the relationships and differences between the chosen 
synthesis methods, we have sought to bring out synergies and possibilities for 
cross-fertilization, as well as limitations. The primary synergies involve a common
mechanism: a notion of schematic rule and the use of unification to apply
rules in a top-down way that incrementally constructs a program, during a
derivation that demonstrates its correctness. The primary differences concern 
the nature of the specifications, in particular the information present; this also 
manifests itself in different semantics and radically different search spaces for 
the different methods. As it is, the purely inductive approach to the synthesis 
of programs with recursion or iteration remains very hard, and it seems doubt- 
ful whether this approach will ever scale up to the synthesis of complex, real-life 
programs. Fortunately, fruitful combinations of these synthesis approaches exist. 

In the end, we believe that progress in this field will be based on exploiting the 
identified synergies and possibilities for cross-fertilization, as well as supporting 
an enhanced, flexible use of schemas. We hope, with this paper, to have made a
constructive analysis of the last decade of research, thereby showing a possible
path for the next decade.

Abstract. In this paper we demonstrate a refinement calculus for logic
programs, which is a framework for developing logic programs from specifications.
The paper is written in a tutorial style, using a running example
to illustrate how the refinement calculus is used to develop logic
programs. The paper also presents an overview of some of the advanced 
features of the calculus, including the introduction of higher-order pro- 
cedures and the refinement of abstract data types. 



1 Introduction 

The aim of this paper is to present an overview of a refinement calculus for 
logic programs. The calculus provides a framework for the stepwise refinement 
of logic programs from specifications. As with other refinement calculi, such as 
the imperative refinement calculus of Back [2], we make use of a wide-spectrum 
programming language that includes both specification constructs and a subset 
that corresponds to executable code. This allows one to transform a specification 
to code within a single notational framework. The specification constructs in- 
clude a specification command that allows the effect of a program to be specified 
in terms of a general predicate, and an assumption command that defines the 
range of values for which a program is expected to work. A semantics for the 
refinement calculus has been given which models commands (both specifications 
and code) as partial functions from sets of bindings of program variables to sub- 
sets of those bindings [12]. A tool has been developed to support the refinement 
calculus [15], based on the Isabelle/HOL theorem prover. 

To enhance the expressive power of the language, it has been augmented 
with both higher-order procedures [7] and a module mechanism with local (ab- 
stract) data types [6]. Higher-order procedures allow generic procedures to be 
written that apply a parameter procedure in a systematic manner. For example, 
the procedure map relates two (equal-length) lists of values by relating their 
corresponding elements according to a procedure given as a parameter to map.

Modules allow an (abstract) data type to be associated with a set of procedures
for manipulating values of that type. Programs can then be developed
using higher-level data types that may be specified in terms of mathematical 
structures, such as (multi-)sets and relations, that may not be directly avail- 
able in the implementation language. By suitably restricting the structure of 
programs using such a module, the module may be replaced by a module of a 
similar structure that uses an implementable or a more efficient representation 
of the data type. 

We give an overview of the refinement calculus by presenting the refinement 
of a program, applyinst, that applies instantiations to a meta-expression to give 
an expression. The example is derived from an algorithm for adapting reusable 
library components in a software development system [13]. 

In Sect. 2 we introduce the wide-spectrum language. In Sect. 3 we give a 
specification of the applyinst procedure. In Sect. 4 we introduce the notion of 
refinement, present some refinement laws, and begin the refinement of applyinst. 
In Sect. 5 we further refine applyinst by introducing higher-order procedure calls. 
The refinement is completed in Sect. 6, where we replace an abstract specification 
type with an implementation type. Sect. 7 discusses aspects of the refinement 
calculus project that distinguish it from other logic program derivation schemes.

2 The Wide-Spectrum Language 

In our wide-spectrum language we can write both specifications and executable 
programs. This has the benefit of allowing stepwise refinement within a single 
notational framework. 

A program in our language is a collection of parameterised procedures. Each 
procedure has a body which is a command whose only free variables are the 
parameters of the procedure. As well as commands that correspond to program- 
ming language constructs, the wide-spectrum language contains two commands 
that are not necessarily executable: the specification command, that constrains 
its free variables; and the assumption command, that can be used to define the 
context in which a command is required to work correctly. Commands may also 
be formed by using disjunction, parallel conjunction, sequential conjunction, ex- 
istential and universal quantification, and recursion. 

A specification command is of the form $\langle P \rangle$, where P is a formula of predicate
logic. The specification $\langle X = 1 \rangle$ may be understood as binding X to 1 in states
where X is unbound; it succeeds in states where X is bound to 1, and fails if
X is bound to something other than 1. In our semantics we model the meaning
of a specification command, and any command in general, as a function from sets
of bindings to sets of bindings, where a binding maps variables to values [12]. A 
binding represents an answer, providing a (single) value for every free variable. 
A variable X is unbound in a set of bindings, or state, if X is mapped to every 
possible value by the bindings. Using this functional meaning for commands, 
the behaviour of a command S is to constrain the set of answers to only those 
that satisfy S; alternatively, S eliminates those answers that do not satisfy S.
The meaning function, e, of any command in our language satisfies the property
$e(s) \subseteq s$ for a set of bindings s; this models the constraining nature of logic
programs (command execution cannot decrease “groundedness”). We provide
further examples below.
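
The set-of-bindings reading of a specification command can be made concrete with a small sketch. The Python below is only an illustrative model, not the semantics of [12]: it assumes a small finite value domain (VALUES) in place of the natural numbers, represents a binding as a frozenset of (variable, value) pairs, and uses the hypothetical helper names all_bindings and spec.

```python
from itertools import product

VALUES = range(4)  # assumption: a small finite stand-in for the natural numbers


def all_bindings(variables):
    """The state in which every variable is unbound: all possible bindings."""
    return {frozenset(zip(variables, vals))
            for vals in product(VALUES, repeat=len(variables))}


def spec(pred):
    """Model of a specification command <P>: keep the bindings satisfying P."""
    def meaning(state):
        return {b for b in state if pred(dict(b))}
    return meaning


x_is_1 = spec(lambda b: b["X"] == 1)

unbound = all_bindings(["X"])            # X is unbound
bound_to_1 = {frozenset({("X", 1)})}     # X is bound to 1
bound_to_2 = {frozenset({("X", 2)})}     # X is bound to 2

print(x_is_1(unbound) == bound_to_1)     # True: X becomes bound to 1
print(x_is_1(bound_to_1) == bound_to_1)  # True: the command succeeds
print(x_is_1(bound_to_2) == set())       # True: the command fails
print(x_is_1(unbound) <= unbound)        # True: the result is a subset of the input
```

In this model the subset property holds by construction, because a specification command can only filter the bindings it is given.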

The following is a definition of the procedure foo, which constrains its parameter
X to be either 0 or 1, and Y to be one greater than X.

$$foo \mathrel{\widehat{=}} (\lambda X, Y : \mathbb{N} \bullet \langle (X = 0 \lor X = 1) \land Y = X + 1 \rangle)$$

The name foo is defined ($\widehat{=}$) as a procedure whose parameters are natural numbers
X and Y, and whose body is a specification command that constrains the
possible values for X and Y . During the refinement process high-level procedure 
specifications are broken down into components that can be executed directly 
in an implementation language such as Prolog or Mercury [24]. This typically 
involves turning logical connectives in the specification into corresponding con- 
nectives in the programming language. For example, foo may be implemented 
as the procedure: 

$$(\lambda X, Y : \mathbb{N} \bullet (\langle X = 0 \rangle \lor \langle X = 1 \rangle), \langle Y = X + 1 \rangle) \tag{2.1}$$

Here we have replaced the logical connectives ‘$\land$’ and ‘$\lor$’ by the corresponding
program connectives ‘,’ and ‘$\lor$’. We use the symbols ‘$\land$’ and ‘$\lor$’ for the logical
operators conjunction and disjunction as well as the program operators for
parallel conjunction and disjunction, respectively. Similarly we use the symbols
‘$\exists$’ and ‘$\forall$’ for both logical and program quantification. This does not lead to
confusion within programs because the logical operators and quantifications can 
only appear inside specification and assumption commands. A summary of the 
operators and quantifiers of the language is shown in Fig. 1. We use S and T to 
stand for commands and X to stand for program variables. 



  Operator     Name                 Example
  $\lor$       disjunction          $S \lor T$
  $\land$      parallel conj.       $S \land T$
  ,            sequential conj.     $S, T$
  $\exists$    existential quant.   $(\exists X : Z \bullet S)$
  $\forall$    universal quant.     $(\forall X : Z \bullet S)$

Fig. 1. Summary of operators in the wide-spectrum language 



The meaning function of a disjunction $S \lor T$ constrains the set of answers to
those that satisfy either S or T; similarly, the meaning function of a conjunction
$S \land T$ restricts the set of answers to those that satisfy both S and T. For
instance, $\langle X = 0 \rangle \lor \langle X = 1 \rangle$ constrains the set of answers to those that
bind X to either 0 or 1. The program $\langle X = 0 \rangle \land \langle X = 1 \rangle$ constrains the set of
answers to those that bind X to both 0 and 1, i.e., the empty set. A command
that returns an empty set of answers acts as Prolog’s fail command; it is
equivalent to $\langle false \rangle$.

The meaning function of a sequential conjunction (S, T) is more interesting. 
It imposes an ordering on the execution; semantically, the answers satisfying S 
are passed as the input to the meaning function of T. Hence, T may assume that 
S is satisfied before it executes. For instance, in (2.1), the program connective ‘,’
ensures that X is bound before the equality involving Y. This ordering allows
the final equality to be implemented using the ‘is’ built-in of Prolog. 
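
The three program operators discussed above can be sketched on the same set-of-bindings model. This is a simplified illustration under stated assumptions: the value domain is finite (VALUES), assumptions and undefinedness are ignored, and the names spec, disj, par and seq are hypothetical helpers, not part of the calculus. In such a model parallel and sequential conjunction yield the same answers; only the order of evaluation differs.

```python
from itertools import product

VALUES = range(4)  # assumption: a small finite stand-in for the natural numbers


def all_bindings(variables):
    """The state in which every variable is unbound."""
    return {frozenset(zip(variables, vals))
            for vals in product(VALUES, repeat=len(variables))}


def spec(pred):
    """Specification command: keep the bindings that satisfy pred."""
    return lambda state: {b for b in state if pred(dict(b))}


def disj(s, t):
    """Disjunction: answers that satisfy either S or T."""
    return lambda state: s(state) | t(state)


def par(s, t):
    """Parallel conjunction: answers that satisfy both S and T."""
    return lambda state: s(state) & t(state)


def seq(s, t):
    """Sequential conjunction: the answers of S are the input of T."""
    return lambda state: t(s(state))


x_is_0 = spec(lambda b: b["X"] == 0)
x_is_1 = spec(lambda b: b["X"] == 1)
y_is_succ = spec(lambda b: b["Y"] == b["X"] + 1)

# The procedure body of (2.1): (<X = 0> or <X = 1>), <Y = X + 1>
foo = seq(disj(x_is_0, x_is_1), y_is_succ)

start = all_bindings(["X", "Y"])
answers = sorted((dict(b)["X"], dict(b)["Y"]) for b in foo(start))
print(answers)                     # [(0, 1), (1, 2)]
print(par(x_is_0, x_is_1)(start))  # set(), i.e. the command fails
```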

Our wide-spectrum language has an executable subset we refer to as code. 
A procedure is code if it has a straightforward translation into a logic program- 
ming language such as Prolog or Mercury [24]. This means the procedure uses 
the operators sequential and parallel conjunction, disjunction, and existential 
quantification, and may be recursive. Furthermore, the specification commands 
must contain predicates that have counterparts in the implementation language, 
e.g., equality, and procedure calls are only allowed on procedures that have also 
been refined to code. The procedure foo (2.1) satisfies these constraints - the 
corresponding Prolog syntax for the procedure foo is: 

foo(X, Y) :- (X = 0; X = 1), Y is X + 1.

When defining procedures we often require some properties of their parameters.
For instance, the following procedure member has an assumption command that
its second parameter, L, is bound to a list of natural numbers; this is represented
by the assumption command $\{L \in list(\mathbb{N})\}$. It also has a specification command
that constrains its first parameter, E, to be an element in the range of (set of
elements in) L, i.e., in our notation $E \in \mathrm{ran}(L)$.

$$member \mathrel{\widehat{=}} (\lambda E : \mathbb{N}, L : list(\mathbb{N}) \bullet \{L \in list(\mathbb{N})\}, \langle E \in \mathrm{ran}(L) \rangle) \tag{2.2}$$

We make no assumption about whether E is bound or unbound. If E is bound, 
the procedure checks whether E is an element of L, failing if it is not. If E is 
unbound, it becomes bound to each element of L. In this paper we use “bound” 
to refer to a variable for which we have an assumption (command) that it is 
bound to a value of its type. Semantically, the meaning function of an assump- 
tion {A} is a partial function that is defined for only those states that satisfy A. 
An assumption does not constrain the set of answers. Hence, the behaviour of 
the command $\{L \in list(\mathbb{N})\}$ is undefined for states in which L is not bound to a
list of natural numbers, and does not restrict the set of answers if L is bound ap- 
propriately. The worst possible program in our language is {false}, which we call 
abort. Its behaviour is undefined for any input: it may do anything, including 
not terminating, halting abnormally, failing (returning an empty answer set), or 
succeeding with arbitrary answers. 

As another example of the use of assumption commands, consider the fol- 
lowing procedure: 



$$divide \mathrel{\widehat{=}} (\lambda X, Y, Z : \mathbb{N} \bullet \{X \in \mathbb{N} \land Y \in \mathbb{N} \land Y \neq 0\}, \langle Z = X \mathbin{\mathrm{div}} Y \rangle)$$

This procedure assumes that the variables X and Y are bound to natural numbers,
and that Y is non-zero, and establishes the relation that Z is the integer
quotient of the division of X by Y. It is not required to do anything when Y is
zero. Assumptions are often needed to justify refinement steps, for example to 
ensure that the primitive predicates will be executed correctly when translated 
into the implementation language. 
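
The difference between an assumption and a specification can be sketched by letting commands return a distinguished "undefined" result. The Python below is an illustrative model only. It assumes that a state satisfies an assumption when every binding in it does, it represents undefined behaviour by the value None (UNDEFINED), and the names assume, spec, seq and state_with are hypothetical helpers.

```python
UNDEFINED = None  # assumption: None stands for "behaviour is undefined"


def spec(pred):
    """Specification command: filter the bindings, never undefined."""
    return lambda state: {b for b in state if pred(dict(b))}


def assume(pred):
    """Assumption command: identity where pred holds, undefined otherwise."""
    def meaning(state):
        if all(pred(dict(b)) for b in state):
            return state
        return UNDEFINED
    return meaning


def seq(s, t):
    """Sequential conjunction that propagates undefinedness."""
    def meaning(state):
        mid = s(state)
        return UNDEFINED if mid is UNDEFINED else t(mid)
    return meaning


def state_with(x, y, z_values):
    """X and Y bound to single values, Z ranging over z_values (unbound)."""
    return {frozenset({("X", x), ("Y", y), ("Z", z)}) for z in z_values}


# Model of divide: {Y != 0}, <Z = X div Y>
divide = seq(assume(lambda b: b["Y"] != 0),
             spec(lambda b: b["Z"] == b["X"] // b["Y"]))

ok = divide(state_with(7, 2, range(10)))
bad = divide(state_with(7, 0, range(10)))

print(sorted(dict(b)["Z"] for b in ok))  # [3]
print(bad)                               # None: nothing is promised when Y is zero
```

Because the assumption is checked first, the division in the specification is never evaluated with a zero divisor, which mirrors the role assumptions play in justifying the use of primitive predicates.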

We may implement the specification of member (2.2) as a recursive procedure. 
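A natural number E is a member of a nonempty list $[H \mid T]$ if either E = H or
E is a member of T.

$$\begin{aligned}
&\mu\, mem \bullet (\lambda E : \mathbb{N}, L : list(\mathbb{N}) \bullet \{L \in list(\mathbb{N})\},\\
&\quad (\exists H : \mathbb{N}, T : list(\mathbb{N}) \bullet \langle L = [H \mid T] \rangle, (\langle E = H \rangle \lor mem(E, T))))
\end{aligned}$$

The notation $(\mu\, mem \bullet body)$ defines mem to be the least fixed point solution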
for mem of the (recursive) equation mem = body. A least fixed point always 
exists for our recursive programs [9], though for non-terminating recursions the 
fixed point is the worst possible program, abort. The refinement rule for in- 
troducing recursion prevents us from deriving non-terminating recursions in our 
refinements [12]. 
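
The recursive structure of mem can be mirrored by a short sketch. The Python generator below is an analogy under stated assumptions, not a translation produced by the calculus: it assumes the list is already bound (as the assumption command requires), models an unbound E by enumerating answers, and the helper name member is hypothetical.

```python
def mem(lst):
    """Yield every E with E in ran(lst), following
    <L = [H | T]>, (<E = H> or mem(E, T))."""
    if not lst:
        return  # the empty list has no decomposition [H | T]: fail
    head, *tail = lst
    yield head            # the disjunct E = H
    yield from mem(tail)  # the disjunct mem(E, T)


def member(e, lst):
    """E already bound: check membership instead of enumerating."""
    return any(x == e for x in mem(lst))


print(list(mem([3, 1, 4])))  # [3, 1, 4]
print(member(1, [3, 1, 4]))  # True
print(member(2, [3, 1, 4]))  # False
```

The recursion terminates because each call receives a strictly shorter list, which is the kind of well-founded argument the recursion-introduction rule asks for.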

3 An Example: Applying Instantiations 

In many computing applications it is necessary to define a mechanism for system- 
atically replacing occurrences of certain syntactic constructs by other constructs. 
For example, in macro languages such as TeX, M4, and the language defined by
the C preprocessor, parameterised macros are replaced by structures in the ob- 
ject language. A similar process may also be used to obtain partial evaluations 
of logic (and other) programs: schematic variables are consistently instantiated 
by other expressions. 

In this section we present a specification of a program that performs such a 
replacement. This particular example is based on an algorithm used for adapt- 
ing reusable library components in the CARE language [13]. In CARE, library 
components can be parameterised over metavariables. Components are used by 
instantiating their metavariables by expressions. The CARE tool includes an 
algorithm, based on higher-order pattern matching, for finding an instantiation 
that maps the metavariables occurring in library components (the source) to 
their corresponding object expressions. To simplify the presentation, we have 
chosen the easier task of applying a given instantiation to a component (source) 
to obtain the object. 

Expressions (the results of applying instantiations) are constructed from vari- 
ables (with names taken from the given set VName) and functors (with names in 
FName) applied to lists of expressions. Constants are viewed as nullary functors. 

$$\begin{aligned}
E \in Expr \Leftrightarrow{} &(\exists X : VName \bullet E = var(X)) \lor {}\\
&(\exists F : FName, L : list(Expr) \bullet E = fn(F, L))
\end{aligned} \tag{3.1}$$

Meta-expressions are a generalization of expressions which may contain metavariables
applied to some parameters; we call such applications schemas. Meta-expressions
are transformed by instantiating their metavariables to give an expression.
The constructors var and fn are as for expressions (except that the
arguments of a function are themselves meta-expressions); in addition, there is
a constructor for schemas, whose names are drawn from the given set MVar.

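$$\begin{aligned}
M \in MetaExpr \Leftrightarrow{} &(\exists X : VName \bullet M = var(X)) \lor {}\\
&(\exists F : FName, L : list(MetaExpr) \bullet M = fn(F, L)) \lor {}\\
&(\exists V : MVar, L : list(MetaExpr) \bullet M = schema(V, L))
\end{aligned} \tag{3.2}$$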

Instantiations map occurrences of metavariables to patterns. Patterns may 
contain place holders of the form ph(i), where i is a natural number. Place
holders give the position of the corresponding parameter in the schema argu- 
ments. Patterns are another generalization of expressions: as well as variables 
and functors, patterns may contain these placeholders. 

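$$\begin{aligned}
P \in Pattern \Leftrightarrow{} &(\exists X : VName \bullet P = var(X)) \lor {}\\
&(\exists F : FName, L : list(Pattern) \bullet P = fn(F, L)) \lor {}\\
&(\exists N : \mathbb{N} \bullet P = ph(N))
\end{aligned} \tag{3.3}$$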

Instantiations are thus partial functions ($\rightharpoonup$) from metavariables to patterns:

$$Inst = MVar \rightharpoonup Pattern$$

For example, let f be a binary function, p and q be metavariables, and g and
h be nullary functions. Consider an instantiation I that maps p to g and q to h.

$$I(p) = fn(g, [\,])$$
$$I(q) = fn(h, [\,])$$

Applying I to the meta-expression f(p, q) results in the expression f(g, h). For
readability purposes we use conventional notation to write (meta-)expressions,
though formally the meta-expression f(p, q) and the expression f(g, h) are represented
by fn(f, [schema(p, []), schema(q, [])]) and fn(f, [fn(g, []), fn(h, [])]), respectively.

Instantiations may also map metavariables to patterns involving placeholders. 
For example, the instantiation I' below defines a metavariable p that accepts two 
parameters, denoted by place holders ph(1) and ph(2), and yields an expression
which might be interpreted as the difference between the second and double the
first:

$$I'(p) = fn(\texttt{'-'}, [ph(2), fn(\texttt{'*'}, [ph(1), fn(2, [\,])])])$$

Applying I' to the meta-expression p(a, b) results in the expression b - a*2. The
expanded representations of the two expressions are schema(p, [fn(a, []), fn(b, [])])
and fn('-', [fn(b, []), fn('*', [fn(a, []), fn(2, [])])]), respectively.

Elements of Inst are partial as they need not have a mapping for every 
metavariable. The range of Inst is restricted to patterns, which themselves con- 
tain no schemas, therefore we only need to consider one level of instantiation 
application. 

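We now give three properties of a relation applyInst for applying an instantiation
to a meta-expression. For all $I \in Inst$ and $Q \in Expr$:

$$(\forall X : VName \bullet \mathsf{applyInst}(I, var(X), var(X))) \tag{3.4}$$

$$\begin{aligned}
&(\forall F : FName, L : list(MetaExpr) \bullet \mathsf{applyInst}(I, fn(F, L), Q) \Leftrightarrow {}\\
&\quad (\exists L' : list(Expr) \bullet \#L = \#L' \land {}\\
&\qquad (\forall i : 1..\#L \bullet \mathsf{applyInst}(I, L(i), L'(i))) \land {}\\
&\qquad Q = fn(F, L')))
\end{aligned} \tag{3.5}$$

$$\begin{aligned}
&(\forall V : MVar, L : list(MetaExpr) \bullet \mathsf{applyInst}(I, schema(V, L), Q) \Leftrightarrow {}\\
&\quad (\exists L' : list(Expr) \bullet \#L = \#L' \land {}\\
&\qquad (\forall i : 1..\#L \bullet \mathsf{applyInst}(I, L(i), L'(i))) \land {}\\
&\qquad V \in \mathrm{dom}(I) \land \mathsf{substph}(L', I(V), Q)))
\end{aligned} \tag{3.6}$$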

Property (3.4) states that applying an instantiation to a variable has no effect.
Property (3.5) states that the result of applying I to fn(F, L) is fn(F, L'), where
the length of L, #L, is the same as the length of L', and L' is the result of applying
I to each element of L. To determine the result of applying I to schema(V, L),
property (3.6), we again construct the list L' which is the result of applying I
to the elements of L. We extract the definition of V from I, I(V), and use the
relation substph to substitute place holders in I(V) with expressions from the
parameter list L'. The result of this, Q, is the instantiation of schema(V, L)
via I. Note that if V is not in the domain of I, applyInst(I, schema(V, L), Q) is
false.

We define the result of substituting place holders with corresponding values 
from a list of expressions by introducing a relation substph. It is defined by the 
following three properties, one for each of the three forms of patterns. For all 
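$Params \in list(Expr)$ and $Out \in Expr$:

$$(\forall X : VName \bullet \mathsf{substph}(Params, var(X), var(X))) \tag{3.7}$$

$$\begin{aligned}
&(\forall N : \mathbb{N} \bullet \mathsf{substph}(Params, ph(N), Out) \Leftrightarrow {}\\
&\quad N \in 1..\#Params \land Out = Params(N))
\end{aligned} \tag{3.8}$$

$$\begin{aligned}
&(\forall F : FName, L : list(Pattern) \bullet \mathsf{substph}(Params, fn(F, L), Out) \Leftrightarrow {}\\
&\quad (\exists L' : list(Expr) \bullet \#L = \#L' \land {}\\
&\qquad (\forall i : 1..\#L \bullet \mathsf{substph}(Params, L(i), L'(i))) \land {}\\
&\qquad Out = fn(F, L')))
\end{aligned} \tag{3.9}$$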




Developing Logic Programs from Specifications Using Stepwise Refinement 






Property (3.7) states that substituting place holders has no effect on a variable.
Property (3.8) replaces a place holder ph(N) with the Nth element from the list
Params, provided N is a valid index into Params. If the input is a functor (3.9),
we recursively apply substph to each of its parameters to obtain a value for Out.
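
Properties (3.4) to (3.9) can be read directly as a pair of recursive functions. The Python below is a minimal sketch under stated assumptions, not the code derived later by refinement: it treats the instantiation and the meta-expression as already bound, computes the single result Q when one exists and returns None when the relation holds for no Q, and the class and function names (Var, Fn, Schema, Ph, map_all, apply_inst) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:            # var(X)
    name: str


@dataclass(frozen=True)
class Fn:             # fn(F, L); constants are nullary functors
    name: str
    args: tuple = ()


@dataclass(frozen=True)
class Schema:         # schema(V, L)
    name: str
    args: tuple = ()


@dataclass(frozen=True)
class Ph:             # ph(N), a 1-based place holder
    index: int


def map_all(f, items):
    """Apply f to every item; None as soon as one application fails."""
    out = []
    for item in items:
        result = f(item)
        if result is None:
            return None
        out.append(result)
    return tuple(out)


def substph(params, pattern):
    if isinstance(pattern, Var):                        # property (3.7)
        return pattern
    if isinstance(pattern, Ph):                         # property (3.8)
        if 1 <= pattern.index <= len(params):
            return params[pattern.index - 1]
        return None
    if isinstance(pattern, Fn):                         # property (3.9)
        args = map_all(lambda p: substph(params, p), pattern.args)
        return None if args is None else Fn(pattern.name, args)
    raise TypeError("not a pattern")


def apply_inst(inst, meta):
    if isinstance(meta, Var):                           # property (3.4)
        return meta
    if isinstance(meta, Fn):                            # property (3.5)
        args = map_all(lambda m: apply_inst(inst, m), meta.args)
        return None if args is None else Fn(meta.name, args)
    if isinstance(meta, Schema):                        # property (3.6)
        args = map_all(lambda m: apply_inst(inst, m), meta.args)
        if args is None or meta.name not in inst:
            return None
        return substph(args, inst[meta.name])
    raise TypeError("not a meta-expression")


# The instantiation I' from the text: I'(p) = ph(2) - ph(1) * 2
inst = {"p": Fn("-", (Ph(2), Fn("*", (Ph(1), Fn("2")))))}
meta = Schema("p", (Fn("a"), Fn("b")))                  # p(a, b)

result = apply_inst(inst, meta)
expected = Fn("-", (Fn("b"), Fn("*", (Fn("a"), Fn("2")))))  # b - a*2
print(result == expected)                               # True
```

Representing an instantiation as a dictionary makes its partiality explicit: a metavariable missing from the dictionary corresponds to V not being in the domain of I.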

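We now specify our top-level program, applyinst, in terms of the relation
applyInst(I, M, Q).

$$\begin{aligned}
&applyinst \mathrel{\widehat{=}} (\lambda I : Inst, M : MetaExpr, Q : Expr \bullet {}\\
&\quad \{I \in Inst \land M \in MetaExpr\}, \langle \mathsf{applyInst}(I, M, Q) \rangle)
\end{aligned}$$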

Since we are applying an instantiation to a meta-expression, we make the as- 
sumption that the instantiation I and the meta-expression M are already bound 
to values of the appropriate types. Any program that calls applyinst must ensure 
that the assumption is satisfied. We refine applyinst in subsequent sections. 

4 Refinement 

Specifications are transformed into code via a sequence of correctness-preserving 
steps; this process is known as refinement. We say a command S is refined by a 
command T, written $S \sqsubseteq T$, if T terminates normally for all inputs for which
S terminates normally (with respect to its assumptions) and T computes the
same set of answers as S whenever S terminates. Each step in a refinement 
is justified by the use of a refinement law, which has been proved correct with 
respect to the underlying semantics. Below we present some refinement laws, and 
then illustrate their use by beginning the refinement of the procedure applyinst 
from Sect. 3. 



4.1 Refinement Laws 

We present a selection of refinement laws below. Where a law is divided into 
two parts by a horizontal line, the part above the line is the proof obligation 
that must be satisfied for the refinement below the line to be valid. A predicate 
equivalence, $P \equiv Q$, states that P and Q are equivalent for all possible values
of their free variables. Similarly, $P \Rrightarrow Q$ states that P implies Q for all possible
values of their free variables. The symbols $\Leftrightarrow$ and $\Rightarrow$ are the usual equivalence
and implication of predicates, which may or may not be true for given values
of their free variables. We use A, P and Q for predicates, and S and T for 
commands. 

Law 1 Weaken assumption

$$\frac{P \Rrightarrow Q}{\{P\} \sqsubseteq \{Q\}}$$

Law 2 Equivalent specifications

$$\frac{P \equiv Q}{\langle P \rangle \sqsubseteq \langle Q \rangle}$$

We can refine an assumption command by transforming its predicate under logical
implication using Law 1. We can refine a specification command by transforming
its predicate under logical equivalence using Law 2. These laws correspond
to weakening assumptions and maintaining the effect on free variables,
respectively.
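
On a small finite model, the refinement relation and these two laws can be checked by brute force. The Python below is a sketch under stated assumptions and does not replace the proofs against the underlying semantics: it uses one variable X over a three-element domain, represents undefined behaviour by None, takes a state to satisfy an assumption when every binding in it does, and uses the hypothetical helper names spec, assume and refines.

```python
from itertools import combinations

UNDEFINED = None  # assumption: None stands for "behaviour is undefined"
VALUES = range(3)
BINDINGS = [frozenset({("X", v)}) for v in VALUES]
# Every state: each subset of the possible bindings for X.
STATES = [set(c) for n in range(len(BINDINGS) + 1)
          for c in combinations(BINDINGS, n)]


def spec(pred):
    return lambda state: {b for b in state if pred(dict(b))}


def assume(pred):
    return lambda state: (state if all(pred(dict(b)) for b in state)
                          else UNDEFINED)


def refines(s, t, states=STATES):
    """S is refined by T: wherever S is defined, T is defined and agrees."""
    for state in states:
        expected = s(state)
        if expected is UNDEFINED:
            continue          # S promises nothing here, so T may do anything
        if t(state) != expected:
            return False      # also covers the case where T is undefined
    return True


def p(b):
    return b["X"] == 1


def q(b):
    return b["X"] >= 1        # p implies q for every value of X


# Law 1: weakening an assumption is a refinement, strengthening is not.
print(refines(assume(p), assume(q)))  # True
print(refines(assume(q), assume(p)))  # False

# Law 2: equivalent predicates give specifications that refine each other.
print(refines(spec(lambda b: b["X"] + 1 == 2), spec(p)))  # True
print(refines(spec(p), spec(lambda b: b["X"] + 1 == 2)))  # True
```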



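Law 3 Assumption in context

$$\frac{A \Rrightarrow (P \Leftrightarrow Q)}{\{A\}, \langle P \rangle \sqsubseteq \{A\}, \langle Q \rangle}$$

Law 4 Propagate assumption

$$\langle P \rangle, S \sqsubseteq \langle P \rangle, (\{P\}, S)$$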



Law 3 generalises Law 2, in that we may make use of the assumption predi- 
cate A in proving the equivalence of predicates P and Q. Assumptions may be 
propagated through sequential conjunction using Law 4. We use this law to pass 
contextual information around a program. 

Law 5 Parallel to sequential

$$S \land T \sqsubseteq S, T$$

Law 6 Lift disjunction

$$\langle P \lor Q \rangle \sqsubseteq \langle P \rangle \lor \langle Q \rangle$$

A parallel conjunction can be refined to a sequential conjunction using Law 5. 
The second component of a sequential conjunction, T, may assume properties 
established by the first component, S, using Law 4. Law 6 allows a predicate 
disjunction inside a specification command to be lifted to its corresponding wide- 
spectrum program operator. Similar laws hold for parallel conjunction and the 
quantifiers. 



Law 7 Monotonicity of parallel conjunction

$$\frac{S \sqsubseteq S' \qquad T \sqsubseteq T'}{S \land T \sqsubseteq S' \land T'}$$

Monotonicity laws state that the result of replacing a component of a program 
by its refinement refines the entire program. In this case, if S' refines S and T' 
refines T then the parallel conjunction $S' \land T'$ refines $S \land T$. Monotonicity
holds for all the operators and both quantifiers in the wide-spectrum language. 
We use monotonicity laws implicitly in refinements. 



4.2 Example: Initial Steps 

In this section we begin the refinement of the procedure applyinst from Sect. 3. 
The initial stages of the refinement presented below follow the structure of 
applyinst. However, care needs to be exercised when introducing recursion to 
ensure the resulting procedures terminate. Some parts of the refinement require 
additional techniques which are introduced in later sections. 

We begin with the specification as given in Sect. 3: 

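$$\begin{aligned}
&applyinst \mathrel{\widehat{=}} (\lambda I : Inst, M : MetaExpr, Q : Expr \bullet {}\\
&\quad \{I \in Inst \land M \in MetaExpr\}, \langle \mathsf{applyInst}(I, M, Q) \rangle)
\end{aligned} \tag{4.1}$$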







Since the definition of applyinst is recursive, we develop a recursive implementation
of applyinst, using the principle of well-founded induction. Let S(X)
be a specification involving a parameter X of type $\sigma$, $\prec$ be a well-founded order
on $\sigma$, and id be a fresh name. As is usual for a recursive procedure with parameter
X, when developing the code for the procedure we may assume that the
procedure satisfies its specification for values smaller than X. That is, we assume
the inductive hypothesis $S(Y) \sqsubseteq id(Y)$ for all $Y \prec X$ when refining S(X). If
under that assumption we can refine S(X) to P, then $S \sqsubseteq \mu\, id \bullet (\lambda X : \sigma \bullet P)$.

For the applyinst example, the parameter X is the triple (I, M, Q), whose
type $\sigma$ is $Inst \times MetaExpr \times Expr$. The well-founded ordering
$(I', M', Q') \prec (I, M, Q)$ is satisfied when M' is a subexpression of M. Finally, we choose the
name apply as our id. The inductive hypothesis is that for all I' : Inst, M' :
MetaExpr, Q' : Expr:

$$\{I' \in Inst \land M' \in MetaExpr \land M' \prec M\}, \langle \mathsf{applyInst}(I', M', Q') \rangle \sqsubseteq apply(I', M', Q') \tag{4.2}$$

We can use the inductive hypothesis to introduce recursive calls to apply within 
procedure applyinst. We will then have refined applyinst to the recursive procedure
$\mu\, apply \bullet (\lambda I : Inst, M : MetaExpr, Q : Expr \bullet \ldots apply(\ldots) \ldots)$.

We begin the refinement of the body of applyinst (4.1). Initially our goal is to
manipulate the body so that recursive calls may be introduced using (4.2). Using
Law 3 (assumption in context) with the assumption $M \in MetaExpr$ allows us
to refine $\langle \mathsf{applyInst}(I, M, Q) \rangle$ to the following.

$$\langle M \in MetaExpr \land \mathsf{applyInst}(I, M, Q) \rangle$$

The following proof obligation was required to apply Law 3:

$$\begin{aligned}
&I \in Inst \land M \in MetaExpr \Rrightarrow {}\\
&\quad (\mathsf{applyInst}(I, M, Q) \Leftrightarrow (M \in MetaExpr \land \mathsf{applyInst}(I, M, Q)))
\end{aligned}$$

Continuing with the refinement, we expand the predicate $M \in MetaExpr$
using (3.2), and distribute the resulting disjunction over applyInst(I, M, Q) using
Law 2 (equivalent specifications). We lift the resulting disjuncts using Law 6 (lift
disjunction) and expand the scope of the quantifications and then lift them. In
addition, we lift the conjunctions and refine them by sequential conjunctions
using Law 5 (parallel to sequential). The body of applyinst is now:

$$(\exists X : VName \bullet \langle M = var(X) \rangle, \langle \mathsf{applyInst}(I, M, Q) \rangle) \lor {} \tag{4.3}$$

$$(\exists F : FName, L : list(MetaExpr) \bullet \langle M = fn(F, L) \rangle, \langle \mathsf{applyInst}(I, M, Q) \rangle) \lor {} \tag{4.4}$$

$$(\exists V : MVar, L : list(MetaExpr) \bullet \langle M = schema(V, L) \rangle, \langle \mathsf{applyInst}(I, M, Q) \rangle) \tag{4.5}$$

Note that for each of the three branches, the structural form of M, established
by the first specification command in each branch, e.g., $\langle M = var(X) \rangle$, can be
assumed when refining the second specification command, $\langle \mathsf{applyInst}(I, M, Q) \rangle$.
We now refine each branch in turn. The first branch (4.3), where M is a variable,
may be refined by using Law 2 (equivalent specifications) with (3.4) on the second
conjunct. The resulting code is:

$$(\exists X : VName \bullet \langle M = var(X) \rangle, \langle Q = var(X) \rangle)$$

The second branch (4.4), where M is a function application, may be refined 
using (3.5). We lift the resulting conjunctions and quantifiers, giving: 

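$$\begin{aligned}
&(\exists F : FName, L : list(MetaExpr) \bullet \langle M = fn(F, L) \rangle,\\
&\quad (\exists L' : list(Expr) \bullet \langle \#L = \#L' \rangle \land {}\\
&\qquad (\forall i : 1..\#L \bullet \langle \mathsf{applyInst}(I, L(i), L'(i)) \rangle) \land {}\\
&\qquad \langle Q = fn(F, L') \rangle))
\end{aligned}$$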

Noting the presence of the specification command $\langle \mathsf{applyInst}(I, L(i), L'(i)) \rangle$, we
can introduce a recursive call using (4.2) provided we can establish the assumption
$\{I \in Inst \land L(i) \in MetaExpr \land L(i) \prec M\}$. We note that we are in
a context in which I and M are assumed to be bound variables of type Inst
and MetaExpr respectively. Since M is bound, and M = fn(F, L) is established
earlier in a sequential conjunction, it follows that L must be bound also, and
therefore each element of L is bound. Furthermore, $L(i) \prec M$ holds since L(i)
is a subexpression of L, which is a subexpression of M. We introduce the assumption
$\{I \in Inst \land L(i) \in MetaExpr \land L(i) \prec M\}$ and propagate it into the
second branch to syntactically match (part of) our program with the left-hand
side of the refinement in the inductive hypothesis (4.2).

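$$\begin{aligned}
&(\exists F : FName, L : list(MetaExpr) \bullet \langle M = fn(F, L) \rangle,\\
&\quad (\exists L' : list(Expr) \bullet \langle \#L = \#L' \rangle \land {}\\
&\qquad (\forall i : 1..\#L \bullet {}\\
&\qquad\quad \{I \in Inst \land L(i) \in MetaExpr \land L(i) \prec M\},\\
&\qquad\quad \langle \mathsf{applyInst}(I, L(i), L'(i)) \rangle) \land {}\\
&\qquad \langle Q = fn(F, L') \rangle))
\end{aligned}$$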

We can now refine lines four and five to a recursive call, using (4.2). 

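$$\begin{aligned}
&(\exists F : FName, L : list(MetaExpr) \bullet \langle M = fn(F, L) \rangle,\\
&\quad (\exists L' : list(Expr) \bullet \langle \#L = \#L' \rangle \land {}\\
&\qquad (\forall i : 1..\#L \bullet apply(I, L(i), L'(i))) \land {}\\
&\qquad \langle Q = fn(F, L') \rangle))
\end{aligned}$$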



The universal quantification will be eliminated in Sect. 5.1.

The third branch (4.5), where M is a schema, may be refined using (3.6). We
lift the resulting conjunctions and quantifiers, giving:

$$\begin{aligned}
&(\exists V : MVar, L : list(MetaExpr) \bullet \langle M = schema(V, L) \rangle,\\
&\quad (\exists L' : list(Expr) \bullet \langle \#L = \#L' \rangle \land {}\\
&\qquad (\forall i : 1..\#L \bullet \langle \mathsf{applyInst}(I, L(i), L'(i)) \rangle) \land {}\\
&\qquad \langle V \in \mathrm{dom}(I) \rangle \land \langle \mathsf{substph}(L', I(V), Q) \rangle))
\end{aligned}$$

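We again introduce a recursive call within the universal quantification, after
introducing the assumption $\{I \in Inst \land L(i) \in MetaExpr \land L(i) \prec M\}$ as in
the refinement of (4.4).

$$\begin{aligned}
&(\exists V : MVar, L : list(MetaExpr) \bullet \langle M = schema(V, L) \rangle,\\
&\quad (\exists L' : list(Expr) \bullet \langle \#L = \#L' \rangle \land {}\\
&\qquad (\forall i : 1..\#L \bullet apply(I, L(i), L'(i))) \land {}\\
&\qquad \langle V \in \mathrm{dom}(I) \rangle \land \langle \mathsf{substph}(L', I(V), Q) \rangle))
\end{aligned}$$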

The universal quantification will be eliminated in Sect. 5.1. The last line includes 
the expression I(V), which is not directly executable. We show how to develop
code for this situation in Sect. 6. First, in Sect. 4.3, we refine the last line to a 
call on a procedure that implements the relation substph. 



4.3 Example: Substituting Parameters for Place Holders 



We define a procedure that implements the relation substph under the assump- 
tion that its first two parameters are bound. Any program that calls substph, 
such as applyinst, must ensure the assumptions are satisfied. 

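$$\begin{aligned}
&substph \mathrel{\widehat{=}} (\lambda Params : list(Expr), In : Pattern, Out : Expr \bullet {}\\
&\quad \{Params \in list(Expr) \land In \in Pattern\},\\
&\quad \langle \mathsf{substph}(Params, In, Out) \rangle)
\end{aligned}$$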

To implement the specification command $\langle \mathsf{substph}(L', I(V), Q) \rangle$ from (4.7)
as a procedure call substph(L', I(V), Q), we must establish the assumption
$\{L' \in list(Expr) \land I(V) \in Pattern\}$. $L' \in list(Expr)$ follows from the recursive
calls apply(I, L(i), L'(i)). We refine the parallel conjunction involving
apply(I, L(i), L'(i)) to sequential conjunction (Law 5), and then use Law 4
to establish $L' \in list(Expr)$ as an assumption before $\langle \mathsf{substph}(L', I(V), Q) \rangle$.
$I(V) \in Pattern$ follows from $I \in Inst$ and $V \in \mathrm{dom}(I)$, therefore we similarly
refine the parallel conjunction involving $V \in \mathrm{dom}(I)$ to sequential conjunction
and propagate the assumption $I(V) \in Pattern$. The code for applyinst so far
is:

$$\begin{aligned}
&applyinst \sqsubseteq {}\\
&\mu\, apply \bullet (\lambda I : Inst, M : MetaExpr, Q : Expr \bullet \{I \in Inst \land M \in MetaExpr\},\\
&\quad (\exists X : VName \bullet \langle M = var(X) \rangle, \langle Q = var(X) \rangle) \lor {}\\
&\quad (\exists F : FName, L : list(MetaExpr) \bullet \langle M = fn(F, L) \rangle,\\
&\qquad (\exists L' : list(Expr) \bullet \langle \#L = \#L' \rangle \land {}\\
&\qquad\quad (\forall i : 1..\#L \bullet apply(I, L(i), L'(i))) \land {}\\
&\qquad\quad \langle Q = fn(F, L') \rangle)) \lor {}\\
&\quad (\exists V : MVar, L : list(MetaExpr) \bullet \langle M = schema(V, L) \rangle,\\
&\qquad (\exists L' : list(Expr) \bullet \langle \#L = \#L' \rangle \land {}\\
&\qquad\quad (\forall i : 1..\#L \bullet apply(I, L(i), L'(i))),\\
&\qquad\quad \langle V \in \mathrm{dom}(I) \rangle, substph(L', I(V), Q))))
\end{aligned}$$

We refine the body of substph following a similar pattern to that of applyinst. 
We introduce a case analysis on the type of In, apply the properties (3.7), (3.8) 
and (3.9) as appropriate, and lift the predicate operators to their wide-spectrum 
counterparts. As with applyinst, we refine the conjunctions occurring in the 
pattern $\langle \mathit{In} = \ldots \rangle \land \ldots$ by sequential conjunctions. This allows us to satisfy
assumptions for the recursive calls that are introduced as part of the refinement. 

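$$
\begin{array}{l}
(\exists X : \mathit{VName} \bullet \langle \mathit{In} = \mathit{var}(X) \rangle, \langle \mathit{Out} = \mathit{var}(X) \rangle) \lor \\
(\exists N : \mathbb{N} \bullet \langle \mathit{In} = \mathit{ph}(N) \rangle, \langle N \in 1..\#\mathit{Params} \rangle \land \langle \mathit{Out} = \mathit{Params}(N) \rangle) \lor \\
(\exists F : \mathit{FName}, L : \mathit{list}(\mathit{Pattern}) \bullet \langle \mathit{In} = \mathit{fn}(F, L) \rangle, \\
\quad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \langle \#L = \#L' \rangle \land \\
\qquad (\forall i : 1..\#L \bullet \langle \mathit{substph}(\mathit{Params}, L(i), L'(i)) \rangle) \land \\
\qquad \langle \mathit{Out} = \mathit{fn}(F, L') \rangle))
\end{array}
$$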

The first disjunct is already code. The second disjunct involves an array-like
access of a list. We may refine this to a call on a recursive procedure that traverses
the list and returns the $N$th element, or fails if $N$ is not a valid index. For brevity
we omit the refinement and assume that a procedure $\mathit{elemi}(L, I, E)$ that implements
$\langle I \in 1..\#L \land E = L(I) \rangle$ exists in our target implementation language. The
refinements of similar list processing procedures are presented in [9]. In the third
disjunct we introduce a recursive call in a similar manner as for applyinst. The
universal quantification is eliminated in Sect. 5.1. Collecting the refinement of
substph gives:

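$$
\begin{array}{l}
\mathit{substph} \sqsubseteq \\
\mu\, \mathit{sub} \bullet (\lambda\, \mathit{Params} : \mathit{list}(\mathit{Expr}),\ \mathit{In} : \mathit{Pattern},\ \mathit{Out} : \mathit{Expr} \bullet \\
\quad \{\mathit{Params} \in \mathit{list}(\mathit{Expr}) \land \mathit{In} \in \mathit{Pattern}\}, \\
\quad (\exists X : \mathit{VName} \bullet \langle \mathit{In} = \mathit{var}(X) \rangle, \langle \mathit{Out} = \mathit{var}(X) \rangle) \lor \\
\quad (\exists N : \mathbb{N} \bullet \langle \mathit{In} = \mathit{ph}(N) \rangle, \mathit{elemi}(\mathit{Params}, N, \mathit{Out})) \lor \\
\quad (\exists F : \mathit{FName}, L : \mathit{list}(\mathit{Pattern}) \bullet \langle \mathit{In} = \mathit{fn}(F, L) \rangle, \\
\qquad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \langle \#L = \#L' \rangle \land \\
\qquad\quad (\forall i : 1..\#L \bullet \mathit{sub}(\mathit{Params}, L(i), L'(i))) \land \\
\qquad\quad \langle \mathit{Out} = \mathit{fn}(F, L') \rangle)))
\end{array}
$$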

To refine this program to code, we eliminate the universal quantifications in 
Sect. 5 and refine the last line of applyinst in Sect. 6.5. 
5 Higher-Order Procedures



In this section we continue the refinement of the applyinst example to illustrate 
the use of higher-order procedures. A higher-order procedure is one that takes 
a procedure as a parameter. For example, consider the following specification of 
the standard higher-order procedure map, which applies a procedure P to all 
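the elements in a list $L$, returning the list $L'$.

$$
\begin{array}{l}
\mathit{map} = \lambda\, P : \sigma \rightarrow \tau \rightarrow \mathit{Cmd},\ L : \mathit{list}(\sigma),\ L' : \mathit{list}(\tau) \bullet \\
\quad \{L \in \mathit{list}(\sigma)\}, \qquad\qquad (5.1) \\
\quad \langle \#L = \#L' \rangle \land (\forall i : 1..\#L \bullet P(L(i), L'(i)))
\end{array}
$$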



The higher-order parameter, $P$, is a procedure that takes two parameters, of
(generic) types $\sigma$ and $\tau$, respectively, and provides a command (type $\mathit{Cmd}$) that
defines the relation between these parameters. The map procedure then relates
two equal-length lists, $L$ and $L'$, provided every element of $L$ is related to the
corresponding element of $L'$ by $P$. In [7] we show how map may be refined to
recursive code.
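
As a rough illustration, specification (5.1) can be read as a check on two given lists. The sketch below is written in Python rather than in the wide-spectrum language; the names map_holds and is_double are hypothetical, and the function only tests the relation for two bound lists. It does not compute $L'$ from $L$, as a logic procedure could.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")
T = TypeVar("T")


def map_holds(p: Callable[[S, T], bool], xs: Sequence[S], ys: Sequence[T]) -> bool:
    """Check the relation of (5.1): the lists have equal length and
    p relates each element of xs to the corresponding element of ys."""
    return len(xs) == len(ys) and all(p(x, y) for x, y in zip(xs, ys))


def is_double(x: int, y: int) -> bool:
    """Hypothetical example relation: y is twice x."""
    return y == 2 * x
```

For instance, map_holds(is_double, [1, 2, 3], [2, 4, 6]) holds, while lists of different lengths are never related.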

From the definition of map we may deduce the following refinement law. 



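Law 8 Introduce map. For all $L$ and $L'$ of type $\mathit{list}(\sigma)$ and $\mathit{list}(\tau)$, respectively,
and all procedures $P$ that take two parameters of type $\sigma$ and $\tau$,

$$
\begin{array}{l}
\{L \in \mathit{list}(\sigma)\}, \\
\langle \#L = \#L' \rangle \land (\forall i : 1..\#L \bullet P(L(i), L'(i))) \\
\quad \sqsubseteq\ \mathit{map}(P, L, L')
\end{array}
$$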



5.1 Example: Introducing map 

Recall the second case of applyinst, where the input pattern is a functor (4.6): 

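$$
\begin{array}{l}
(\exists F : \mathit{FName}, L : \mathit{list}(\mathit{MetaExpr}) \bullet \langle M = \mathit{fn}(F, L) \rangle, \\
\quad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \langle \#L = \#L' \rangle \land \\
\qquad (\forall i : 1..\#L \bullet \mathit{apply}(I, L(i), L'(i))) \land \\
\qquad \langle Q = \mathit{fn}(F, L') \rangle))
\end{array}
$$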

Note that the second and third lines almost match the definition of map (5.1). 
From the assumption $M \in \mathit{MetaExpr}$ in applyinst we can introduce the assumption
$L \in \mathit{list}(\mathit{MetaExpr})$, which implies $L$ is bound.

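$$
\begin{array}{l}
(\exists F : \mathit{FName}, L : \mathit{list}(\mathit{MetaExpr}) \bullet \langle M = \mathit{fn}(F, L) \rangle, \\
\quad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \\
\qquad \{L \in \mathit{list}(\mathit{MetaExpr})\}, \\
\qquad \langle \#L = \#L' \rangle \land (\forall i : 1..\#L \bullet \mathit{apply}(I, L(i), L'(i))) \land \\
\qquad \langle Q = \mathit{fn}(F, L') \rangle))
\end{array}
$$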

The third and fourth lines now match (5.1), except that apply takes three
parameters instead of the two expected by map. To match the definition of map
fully, we use a partial application of apply, $\mathit{apply}(I)$. In our language all
procedures are curried (though for brevity of presentation we have not shown
them as such). Hence, $\mathit{apply}(I)$ is a function that takes two parameters, as
required by the signature of map, and we may write $\mathit{apply}(I, L(i), L'(i))$ as
$\mathit{apply}(I)(L(i), L'(i))$. We use Law 8 (introduce map) with $\mathit{apply}(I)$ as the first
parameter to map, giving:

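$$
\begin{array}{l}
(\exists F : \mathit{FName}, L : \mathit{list}(\mathit{MetaExpr}) \bullet \langle M = \mathit{fn}(F, L) \rangle, \\
\quad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \mathit{map}(\mathit{apply}(I), L, L') \land \langle Q = \mathit{fn}(F, L') \rangle))
\end{array}
$$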

The procedure apply with instantiation $I$ is applied to each element of $L$, resulting
in the list $L'$. Given a target language that implements a map function and
supports partial application of procedure calls, such as Mercury [24], the above
command can be translated to executable code.
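
The effect of passing the partial application $\mathit{apply}(I)$ to map can be sketched in Python, with functools.partial standing in for currying. This is only an illustration under stated assumptions: expressions are encoded as tagged tuples (a hypothetical encoding, not the paper's), apply is read functionally with $I$ and $M$ as inputs, and only the var and fn cases are covered.

```python
from functools import partial


def apply(inst, m):
    """Functional reading of apply for the var and fn cases only.

    Expressions are tagged tuples (an encoding assumed for this sketch):
    ("var", name) or ("fn", name, [arguments]).
    """
    tag = m[0]
    if tag == "var":
        # First case: a variable is returned unchanged
        return m
    if tag == "fn":
        _, f, args = m
        # partial(apply, inst) plays the role of apply(I) passed to map
        return ("fn", f, list(map(partial(apply, inst), args)))
    raise ValueError("the schema case is not covered by this sketch")
```

Because no schema is involved in these two cases, applying any instantiation to such an expression returns an equal expression.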

Using similar refinements to those above, we may replace the universal quantifications
appearing elsewhere in applyinst and substph by calls to map. Collecting
the refinement of applyinst and substph gives:

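$$
\begin{array}{l}
\mathit{applyinst} \sqsubseteq \\
\mu\, \mathit{apply} \bullet (\lambda\, I : \mathit{Inst},\ M : \mathit{MetaExpr},\ Q : \mathit{Expr} \bullet \{I \in \mathit{Inst} \land M \in \mathit{MetaExpr}\}, \\
\quad (\exists X : \mathit{VName} \bullet \langle M = \mathit{var}(X) \rangle, \langle Q = \mathit{var}(X) \rangle) \lor \\
\quad (\exists F : \mathit{FName}, L : \mathit{list}(\mathit{MetaExpr}) \bullet \langle M = \mathit{fn}(F, L) \rangle, \\
\qquad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \mathit{map}(\mathit{apply}(I), L, L') \land \langle Q = \mathit{fn}(F, L') \rangle)) \lor \\
\quad (\exists V : \mathit{MVar}, L : \mathit{list}(\mathit{MetaExpr}) \bullet \langle M = \mathit{schema}(V, L) \rangle, \\
\qquad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \mathit{map}(\mathit{apply}(I), L, L'), \\
\qquad\quad \langle V \in \mathrm{dom}(I) \rangle, \mathit{substph}(L', I(V), Q))))
\end{array}
$$

$$
\begin{array}{l}
\mathit{substph} \sqsubseteq \\
\mu\, \mathit{sub} \bullet (\lambda\, \mathit{Params} : \mathit{list}(\mathit{Expr}),\ \mathit{In} : \mathit{Pattern},\ \mathit{Out} : \mathit{Expr} \bullet \\
\quad \{\mathit{Params} \in \mathit{list}(\mathit{Expr}) \land \mathit{In} \in \mathit{Pattern}\}, \\
\quad (\exists X : \mathit{VName} \bullet \langle \mathit{In} = \mathit{var}(X) \rangle, \langle \mathit{Out} = \mathit{var}(X) \rangle) \lor \\
\quad (\exists N : \mathbb{N} \bullet \langle \mathit{In} = \mathit{ph}(N) \rangle, \mathit{elemi}(\mathit{Params}, N, \mathit{Out})) \lor \\
\quad (\exists F : \mathit{FName}, L : \mathit{list}(\mathit{Pattern}) \bullet \langle \mathit{In} = \mathit{fn}(F, L) \rangle, \\
\qquad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \mathit{map}(\mathit{sub}(\mathit{Params}), L, L') \land \langle \mathit{Out} = \mathit{fn}(F, L') \rangle)))
\end{array}
$$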

Only the last line of applyinst is not code; we present the refinement of this line 
in Sect. 6.5. 



6 Modular Logic Program Refinement 



In this section we outline a technique for module data refinement [6], where a 
program is refined by changing the type of some of its variables. We assume a 
type and operations on that type are encapsulated in a module. By making some 
assumptions about the way in which such modules are used, we can develop 
efficient implementations of abstract modules. We use module refinement to 
complete the refinement of the applyinst procedure. 
6.1 Modules

A module is a collection of procedures that operate on values of a given data 
type. We refer to variables of the given type as opaque. For instance, consider 
the module AbstractInst that operates on values of the abstract partial function
type for Inst. 

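Module AbstractInst

    Type $\mathit{Inst} = \mathit{MVar} \rightharpoonup \mathit{Expr}$

    $\mathit{init} = (\lambda\, I : \mathit{Inst} \bullet \langle I = \emptyset \rangle)$

    $\mathit{lookup} = (\lambda\, I : \mathit{Inst},\ K : \mathit{MVar},\ V : \mathit{Expr} \bullet$
        $\{I \in \mathit{Inst} \land K \in \mathit{MVar}\}, \langle (K, V) \in I \rangle)$
    $\mathit{update} = (\lambda\, I : \mathit{Inst},\ K : \mathit{MVar},\ V : \mathit{Expr},\ I' : \mathit{Inst} \bullet \ldots)$

End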

The procedure init establishes $I$ as the empty function, $\emptyset$, while lookup establishes
$V$ as the value of $I(K)$, or fails if $K$ is not in the domain of $I$. The procedure
update, the details of which we omit for brevity, may be used to construct
a nonempty value $I'$ of type $\mathit{Inst}$. The AbstractInst module could be generalised
to implement a partial function with any types for the domain and range; for
simplicity we use the above instance where the partial function is from $\mathit{MVar}$ to
$\mathit{Expr}$.
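
To make the intended behaviour of the module concrete, the following Python sketch models the abstract partial function with a dict. It is an illustration only: the class is a hypothetical stand-in, lookup returns the list of answers for $V$ (empty on failure), and the behaviour of update is an assumption, since its definition is omitted here.

```python
class AbstractInst:
    """Illustrative stand-in for the abstract module; a dict models the partial function."""

    @staticmethod
    def init():
        # init establishes I as the empty function
        return {}

    @staticmethod
    def lookup(inst, key):
        # Answers for V: one value if key is in dom(I), none otherwise (failure)
        return [inst[key]] if key in inst else []

    @staticmethod
    def update(inst, key, value):
        # ASSUMPTION: the source omits update; here it overrides the mapping for key
        new_inst = dict(inst)
        new_inst[key] = value
        return new_inst
```

Returning a list of answers rather than a single value keeps the relational reading visible: failure is simply the empty answer list.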

For encapsulation purposes, a program that uses the abstract Inst type 
should make use of that type only through the procedures of the AbstractInst
module. A program that uses AbstractInst must also respect its intended modes,
which can be determined by looking at the assumptions for each procedure. If a 
parameter is assumed to be of the opaque type, that parameter is called an input 
to the procedure; if there is no such type assumption the parameter is called an 
output. 

Since partial functions are not directly implemented in most languages, we 
would like to replace all the references to the abstract module with references to 
a concrete module that faithfully implements the abstract procedures using an 
implementation language data type. 

6.2 Module Refinement 

In general we say a module $\mathcal{M}$ is module-refined by module $\mathcal{M}'$ under the following
condition: all programs $P$ are refined by replacing calls to the procedures
of the module $\mathcal{M}$ by calls to the corresponding procedures in the module $\mathcal{M}'$.
While this definition is the most general, by restricting the class of programs $P$
for which the module refinement must hold we can simplify some of the reasoning.
Furthermore, by assuming that calls to a module occur in a certain order
(imposed by sequential conjunction), we can allow efficient representations to be
used that would not be possible in the more general case. Consider the following
program that uses the procedures from AbstractInst:

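$$
\begin{array}{l}
(\exists I : \mathit{Inst} \bullet \mathit{init}(I), \ldots, (\exists I' : \mathit{Inst} \bullet \mathit{update}(I, X, Y, I'), \ldots, \\
\quad \mathit{lookup}(I', K, V)))
\end{array}
$$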
There is a strict order on the calls to init, update, and lookup, though within the
$\ldots$ there may be arbitrary commands that do not use the module or variables
of its type. Suppose we have a module ConcreteInst that is a module refinement
of AbstractInst, providing procedures $\mathit{init}^{+}$, $\mathit{lookup}^{+}$ and $\mathit{update}^{+}$ using an
implementable type, $\mathit{Inst}^{+}$, to represent the instantiation. Since ConcreteInst is a
module refinement of AbstractInst, we may refine the above program to

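$$
\begin{array}{l}
(\exists I^{+} : \mathit{Inst}^{+} \bullet \mathit{init}^{+}(I^{+}), \ldots, (\exists {I'}^{+} : \mathit{Inst}^{+} \bullet \mathit{update}^{+}(I^{+}, X, Y, {I'}^{+}), \ldots, \\
\quad \mathit{lookup}^{+}({I'}^{+}, K, V)))
\end{array}
$$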

Note, however, that it is not the case that $\mathit{init} \sqsubseteq \mathit{init}^{+}$. Indeed, because they
operate on different types ($\mathit{Inst}$ and $\mathit{Inst}^{+}$), one could not possibly refine the
other since they provide different sets of answers for their parameters.

To prove that a module $\mathcal{M}$ is refined by a module $\mathcal{M}'$ we use a coupling
invariant, which is a relation between variables of the abstract and concrete type.
Each pair of corresponding procedures from the modules is checked against
the conditions for module refinement, given in [6], using the particular coupling
invariant chosen. However, in many situations it is possible to automatically
calculate a concrete module, given an abstract module and a coupling invariant.
Using the calculation process to derive the concrete module guarantees that the
conditions for module refinement will be met. In Sect. 6.3 we refine applyinst
to use a call on lookup from the AbstractInst module. In Sect. 6.4 we introduce
module calculation, and in Sect. 6.5 we show how it may be applied to the lookup
procedure.

6.3 Example: Introducing lookup 

In the third case of applyinst, where $M$ is a schema, we need a refinement of
the command:

$$\langle V \in \mathrm{dom}(I) \rangle, \mathit{substph}(L', I(V), Q)$$

This is the only part of the applyinst program that makes direct use of the 
instantiation $I$. However, we are not able to refine this directly to code because
the expression $I(V)$ is not directly implementable in most logic programming
languages. Below we refine the above program fragment to make use of the
procedure lookup from the AbstractInst module.

We separate $I(V)$ from the use of its value (sometimes called flattening). We
introduce an existential variable $\mathit{FDefn}$ that has the value $I(V)$, using Law 2
(equivalent specifications).

$$\langle (\exists \mathit{FDefn} : \mathit{Expr} \bullet V \in \mathrm{dom}(I) \land \mathit{FDefn} = I(V)) \rangle, \mathit{substph}(L', I(V), Q)$$

Treating the abstract partial function representation of an instantiation, $I$, as
a set of pairs, we rewrite $V \in \mathrm{dom}(I) \land \mathit{FDefn} = I(V)$ as $(V, \mathit{FDefn}) \in I$.
Now we lift the existential quantifier, expand its scope to encompass the call to
substph, and replace $I(V)$ with $\mathit{FDefn}$ in the call to substph.

$$(\exists \mathit{FDefn} : \mathit{Expr} \bullet \langle (V, \mathit{FDefn}) \in I \rangle, \mathit{substph}(L', \mathit{FDefn}, Q))$$
The command $\langle (V, \mathit{FDefn}) \in I \rangle$ is refined by a call to lookup, since $I \in \mathit{Inst}$
and $V \in \mathit{MVar}$ are guaranteed by the context.

$$(\exists \mathit{FDefn} : \mathit{Expr} \bullet \mathit{lookup}(I, V, \mathit{FDefn}), \mathit{substph}(L', \mathit{FDefn}, Q)) \qquad (6.1)$$

Thus the only non-trivial reference to the instantiation $I$ in the applyinst procedure
occurs in a call on module AbstractInst.



6.4 Module Calculation 

A technique for deriving, or calculating, a concrete module from an abstract
module has been developed [9]. Consider an abstract procedure of the form
$(\lambda\, I : \sigma, V : \tau \bullet \{A\}, \langle P \rangle)$, having no opaque output parameters, and in which
$V$ is not of the opaque type ($V$ is referred to as a regular parameter). Given a
coupling invariant $\mathit{CI}(I, L)$ relating a variable $I$ of the abstract type $\sigma$ with a
variable $L$ of the concrete type $\sigma^{+}$, we may calculate the corresponding concrete
procedure as:

$$
\begin{array}{l}
(\lambda\, L : \sigma^{+}, V : \tau \bullet \\
\quad \{(\exists I : \sigma \bullet \mathit{CI}(I, L) \land A)\}, \qquad\qquad (6.2) \\
\quad \langle (\forall I : \sigma \bullet \mathit{CI}(I, L) \land A \Rightarrow P) \rangle)
\end{array}
$$

The assumption may be understood as a constraint on L that there exists some 
abstract instantiation I which satisfies the abstract assumption A and to which 
L is related via the coupling invariant. Similarly, the specification command can 
be understood as specifying that, for all abstract instantiations I related to L 
and satisfying the assumption A, the abstract specification P must hold. Once 
a procedure has been calculated in the above form, the developer then simplifies 
the assumption and specification to eliminate references to the abstract type 
$\sigma$. In many cases, depending on the form of the coupling invariant, this can be
done via applications of the one-point laws. In the next section we use the above 
result to calculate the concrete procedure for lookup. 

The calculation technique may also be applied to abstract procedures with 
opaque output parameters, but for brevity we do not present the general form 
of the corresponding concrete procedure here (see [9] for details).

6.5 Example: Calculation 

We can calculate the corresponding concrete procedure for lookup after choosing
an appropriate concrete representation and relating it to the abstract type via
a coupling invariant. We choose to concretely represent the partial function by
a list whose elements are pairs of $\mathit{MVar}$s and $\mathit{Expr}$s. We relate a variable $I$ of
the abstract (partial function) type with a variable $L$ of the concrete (list) type
using the coupling invariant $I = \mathrm{ran}(L)$. This coupling invariant states that the
abstract instantiation $I$ contains all of the pairs in the list $L$. The relationship
is straightforward since a partial function can be thought of as a set of pairs,
the first elements of which form the domain of the function, with the second
elements being the corresponding values for the members of the domain. Hence,
the range of the list $[(x, g), (y, h)]$ forms the set $\{(x, g), (y, h)\}$, which is a partial
function that maps $x$ to $g$ and $y$ to $h$.
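
The coupling invariant can be checked mechanically on small examples. The Python sketch below is an illustration under assumed encodings: the abstract instantiation is a dict, the concrete one is a list of pairs, and the helper names ran, is_partial_function and coupled are hypothetical.

```python
def ran(pairs):
    """Range of a list: the set of its elements."""
    return set(pairs)


def is_partial_function(relation):
    """True if no key is paired with two different values in the set of pairs."""
    keys = [k for k, _ in relation]
    return len(keys) == len(set(keys))


def coupled(abstract, pairs):
    """Coupling invariant I = ran(L), with I given as a dict and L as a list of pairs."""
    return set(abstract.items()) == ran(pairs)
```

Note that a list such as [("x", "g"), ("x", "h")] has a range that is not a partial function, so it does not represent any abstract instantiation.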

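Using the general form of (6.2) with the coupling invariant $I = \mathrm{ran}(L)$, noting
that the type $\sigma$ is $\mathit{Inst}$ and $K$ and $V$ are regular variables, generates the concrete
procedure $\mathit{lookup}^{+}$:

$$
\begin{array}{l}
\mathit{lookup}^{+} = (\lambda\, L : \mathit{list}(\mathit{MVar} \times \mathit{Expr}),\ K : \mathit{MVar},\ V : \mathit{Expr} \bullet \\
\quad \{(\exists I : \mathit{Inst} \bullet I = \mathrm{ran}(L) \land I \in \mathit{Inst} \land K \in \mathit{MVar})\}, \\
\quad \langle (\forall I : \mathit{Inst} \bullet I = \mathrm{ran}(L) \land I \in \mathit{Inst} \land K \in \mathit{MVar} \Rightarrow (K, V) \in I) \rangle)
\end{array}
$$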

This is a valid module refinement of lookup using a list of pairs to represent the 
abstract partial function. However, it is rather complex and not directly exe- 
cutable at this stage, since it still uses the abstract type (though such references 
are scoped by quantifications). 

We refine the procedure body to code. The assumption and specification 
commands may be simplified using the one-point rules for existential and uni- 
versal quantification, respectively, and the resulting redundant antecedent in the 
specification command may be removed using Law 3 (assumption in context),
giving: 

$$\{\mathrm{ran}(L) \in \mathit{Inst} \land K \in \mathit{MVar}\}, \langle (K, V) \in \mathrm{ran}(L) \rangle$$
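
The two one-point simplifications can be spelled out as follows. This is a sketch that uses only the symbols already introduced; the one-point rules are assumed in their usual form, with the bound variable $I$ replaced by $\mathrm{ran}(L)$.

$$
\begin{array}{l}
(\exists I : \mathit{Inst} \bullet I = \mathrm{ran}(L) \land I \in \mathit{Inst} \land K \in \mathit{MVar}) \\
\quad \equiv\ \mathrm{ran}(L) \in \mathit{Inst} \land K \in \mathit{MVar} \\[1ex]
(\forall I : \mathit{Inst} \bullet I = \mathrm{ran}(L) \land I \in \mathit{Inst} \land K \in \mathit{MVar} \Rightarrow (K, V) \in I) \\
\quad \equiv\ (\mathrm{ran}(L) \in \mathit{Inst} \land K \in \mathit{MVar} \Rightarrow (K, V) \in \mathrm{ran}(L))
\end{array}
$$

Within the context of the simplified assumption, the antecedent of the second formula is already known to hold, which is why it can be dropped from the specification command.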

We simplify the assumption using Law 1 (weaken assumption) since 

$$\mathrm{ran}(L) \in \mathit{Inst} \Rightarrow L \in \mathit{list}(\mathit{MVar} \times \mathit{Expr})$$

However, we must still refine the specification $\langle (K, V) \in \mathrm{ran}(L) \rangle$ to code. This
is a membership test in the list L. We omit the details of the refinement for 
brevity, and assume that our target implementation language has an appropriate 
procedure member, similar to that presented in Sect. 2. 

After applying the calculation technique to the init and update procedures
(each of which contains output parameters, and therefore requires a slightly different
calculation from that of lookup [9]), we have the full concrete module. We use
$\mathit{Inst}^{+}$ as the name of the concrete type $\mathit{list}(\mathit{MVar} \times \mathit{Expr})$.

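Module ConcreteInst

    Type $\mathit{Inst}^{+} = \mathit{list}(\mathit{MVar} \times \mathit{Expr})$
    $\mathit{init}^{+} = (\lambda\, L : \mathit{Inst}^{+} \bullet \langle L = [\,] \rangle)$
    $\mathit{lookup}^{+} = (\lambda\, L : \mathit{Inst}^{+},\ K : \mathit{MVar},\ V : \mathit{Expr} \bullet$
        $\{L \in \mathit{Inst}^{+} \land K \in \mathit{MVar}\}, \mathit{member}((K, V), L))$
    $\mathit{update}^{+} = \ldots$

End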

Since we have followed the calculation process, we may refine a program that
uses AbstractInst - provided the program satisfies the structural restrictions
discussed in Sect. 6.2 - by replacing each of its calls to procedures of module
AbstractInst with calls to the corresponding procedures of module ConcreteInst.

Collecting the refinements from each section gives us the complete program.
It uses the type $\mathit{Inst}^{+}$ and procedure $\mathit{lookup}^{+}$ from the ConcreteInst module.
We use the symbol $\sqsubseteq_{d}$ to indicate that the refinement of applyinst is a
data refinement, since we data refined the original instantiation type (partial
function) to a list of pairs.

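$$
\begin{array}{l}
\mathit{applyinst} \sqsubseteq_{d} \\
\mu\, \mathit{apply} \bullet (\lambda\, I : \mathit{Inst}^{+},\ M : \mathit{MetaExpr},\ Q : \mathit{Expr} \bullet \{I \in \mathit{Inst}^{+} \land M \in \mathit{MetaExpr}\}, \\
\quad (\exists X : \mathit{VName} \bullet \langle M = \mathit{var}(X) \rangle, \langle Q = \mathit{var}(X) \rangle) \lor \\
\quad (\exists F : \mathit{FName}, L : \mathit{list}(\mathit{MetaExpr}) \bullet \langle M = \mathit{fn}(F, L) \rangle, \\
\qquad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \mathit{map}(\mathit{apply}(I), L, L') \land \langle Q = \mathit{fn}(F, L') \rangle)) \lor \\
\quad (\exists V : \mathit{MVar}, L : \mathit{list}(\mathit{MetaExpr}) \bullet \langle M = \mathit{schema}(V, L) \rangle, \\
\qquad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \mathit{map}(\mathit{apply}(I), L, L'), \\
\qquad\quad (\exists \mathit{FDefn} : \mathit{Expr} \bullet \mathit{lookup}^{+}(I, V, \mathit{FDefn}), \\
\qquad\qquad \mathit{substph}(L', \mathit{FDefn}, Q)))))
\end{array}
$$

$$
\begin{array}{l}
\mathit{substph} \sqsubseteq \\
\mu\, \mathit{sub} \bullet (\lambda\, \mathit{Params} : \mathit{list}(\mathit{Expr}),\ \mathit{In} : \mathit{Pattern},\ \mathit{Out} : \mathit{Expr} \bullet \\
\quad \{\mathit{Params} \in \mathit{list}(\mathit{Expr}) \land \mathit{In} \in \mathit{Pattern}\}, \\
\quad (\exists X : \mathit{VName} \bullet \langle \mathit{In} = \mathit{var}(X) \rangle, \langle \mathit{Out} = \mathit{var}(X) \rangle) \lor \\
\quad (\exists N : \mathbb{N} \bullet \langle \mathit{In} = \mathit{ph}(N) \rangle, \mathit{elemi}(\mathit{Params}, N, \mathit{Out})) \lor \\
\quad (\exists F : \mathit{FName}, L : \mathit{list}(\mathit{Pattern}) \bullet \langle \mathit{In} = \mathit{fn}(F, L) \rangle, \\
\qquad (\exists L' : \mathit{list}(\mathit{Expr}) \bullet \mathit{map}(\mathit{sub}(\mathit{Params}), L, L') \land \langle \mathit{Out} = \mathit{fn}(F, L') \rangle)))
\end{array}
$$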
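
For readers who prefer to see the final program run, the following Python sketch gives a functional reading of it, with $I$, $M$, $\mathit{Params}$ and $\mathit{In}$ as inputs. It is not the paper's target code (the paper mentions Mercury as a target language): the tagged-tuple encoding of expressions, the exception used to model failure, and the helper name lookup_c are all assumptions made for this illustration.

```python
from functools import partial


def lookup_c(pairs, key):
    """Answers V such that (key, V) occurs in the list of pairs."""
    return [v for k, v in pairs if k == key]


def substph(params, pattern):
    """Replace each place holder ph(N) in pattern by the N-th parameter."""
    tag = pattern[0]
    if tag == "var":
        return pattern
    if tag == "ph":
        n = pattern[1]
        if not 1 <= n <= len(params):
            # Models failure of elemi when N is not a valid index
            raise LookupError("place holder index out of range")
        return params[n - 1]  # lists in the specification are indexed from 1
    if tag == "fn":
        _, f, args = pattern
        return ("fn", f, list(map(partial(substph, params), args)))
    raise ValueError("not a pattern")


def applyinst(inst, m):
    """Apply the instantiation inst (a list of pairs) to the meta-expression m."""
    tag = m[0]
    if tag == "var":
        return m
    if tag == "fn":
        _, f, args = m
        return ("fn", f, list(map(partial(applyinst, inst), args)))
    if tag == "schema":
        _, v, args = m
        new_args = list(map(partial(applyinst, inst), args))
        answers = lookup_c(inst, v)
        if not answers:
            # Models failure of lookup when v is not in the domain
            raise LookupError("meta-variable not instantiated")
        return substph(new_args, answers[0])
    raise ValueError("not a meta-expression")
```

As a usage example, with inst = [("P", ("fn", "plus", [("ph", 1), ("ph", 2)]))], applying it to ("schema", "P", [("var", "a"), ("var", "b")]) substitutes the two arguments for the place holders in the definition of P.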

7 Conclusions 

In this paper we have presented a refinement calculus for logic programming, 
and illustrated how it can be used by developing a small, but non-trivial, logic 
program from its specification. Our refinement calculus is similar in style to de- 
ductive logic program synthesis (surveys of which can be found in [3, 11]). At 
the most fundamental level, logic program development is the manipulation of 
predicates from general logic to a subset that corresponds to code, and devel- 
oping a logic program in either the refinement calculus or synthesis style will 
require similar manipulation. We compare our approach to other logic program 
development schemes in the next section. 

A distinguishing feature of the refinement calculus approach is its rich spec- 
ification language. In particular, a program (fragment) has an associated as- 
sumption component, similar to the precondition component of a program spec- 
ification in an imperative programming formalism. This allows one to partially 
specify procedures, in the sense that their operation is not defined if the assump- 
tions do not hold. We make use of this when developing recursive procedures by 
requiring that recursive calls satisfy an assumption that their arguments are 
bound to values that are strictly less than those of the enclosing call according 
to some well-founded relation. For instance, if the member procedure given in 
Sect. 2 is passed an unbound parameter for formal parameter L, then the tail
of L, T, will also be unbound. This will result in infinite recursion. In the re- 
finement calculus framework, the conditions for introducing recursion require a 
well-founded ordering to be maintained. To satisfy this condition, the recursive 
parameter must be bound. In this paper we have introduced recursion somewhat 
informally, but a more formal approach based on the use of a refinement law for
introducing recursion may be found in [12]. 

The translation to actual logic program code is not as straightforward as for 
an imperative language. The translation is not just a matter of turning conjunc- 
tions into commas and using defined language primitives - the order of conjuncts 
in a procedure goal must also be considered. At the logic level, conjunction is 
commutative, and therefore does not provide any guide to ordering its conjuncts. 
Knowledge of the execution mechanism of the implementation language is re- 
quired to correctly order the conjuncts in a goal. In the calculus framework, 
assumptions and sequential conjunction partially bridge this gap. In Sect. 4.3 
we saw that parallel conjunctions needed to be refined to sequential conjunc- 
tions so that the assumptions of the second operand of the conjunction (which 
was refined to a procedure call to substph) were established by the first operand 
(these assumptions are required to ensure the recursion of substph terminates). 
This ordering is precisely that required in a real (Prolog) implementation to 
ensure termination. The order of remaining parallel conjunctions is irrelevant to 
the satisfaction of procedure call assumptions, and termination of recursion (the 
order may be relevant, however, to performance issues - in this case, knowledge 
of the execution mechanism is required). Related to the issue of ordering con- 
juncts (goals) is the ordering of disjuncts (clauses). Given our total-correctness 
requirement, recursive procedures developed using the recursion introduction 
refinement law will terminate regardless of the ordering of disjuncts, assuming 
that assumptions are met. For this reason, the wide-spectrum language does not 
have a sequential disjunction operator. 

The refinement calculus approach as described in [12] has been extended in 
several directions. One of these is data refinement [8, 6], where the type of a 
program variable is refined to some other type, usually for implementation pur- 
poses, as illustrated in Sect. 6. The specification language has been extended to 
include higher-order constructs [7]. The introduction of higher-order constructs 
simplifies some refinements by the use of powerful higher-order procedures, as il- 
lustrated in Sect. 5. The specification language has also been extended to include 
demonic non-determinism [14], although we did not make use of it in this paper. 
Demonic, or “don’t care”, non-determinism allows one to choose between sets of
possible answers that a program must return; normally an implementation must
return exactly the same set of answers as the specification. The set of answers
associated with a demonic choice between two programs $S$ and $T$, written $S \sqcap T$,
is either the set of answers that $S$ returns or the set of answers that $T$ returns. This
is in contrast to the set of answers associated with a disjunction $S \lor T$, which
is the union of the sets of answers for $S$ and $T$.
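
The difference between the two operators can be illustrated on finite answer sets. The Python sketch below is an informal reading of the description above, not the calculus's semantics; the function names are hypothetical and answer sets are modelled as frozensets.

```python
def disjunction(s, t):
    """Answer set of S or T: the union of the two answer sets."""
    return s | t


def refines_demonic_choice(impl, s, t):
    """An implementation of the demonic choice between S and T must
    return exactly the answers of S or exactly the answers of T."""
    return impl == s or impl == t
```

Under this reading, an implementation that returns the union of both answer sets implements the disjunction but, in general, not the demonic choice.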
A tool has been developed to support the refinement calculus [15], based on
the Isabelle/HOL theorem prover. By using facilities provided by the theorem
prover it is possible to automatically discharge many proof obligations associated
with refinement law applications, though the user of the tool guides the refinement
by selecting which rules to apply. A code generation tool has also been
developed [5]. It takes output from the refinement tool and generates executable
code for the Mercury language [24]. This involves the deduction of intended
(Mercury) mode information from the assumptions a procedure makes about its
parameters. The full semantics of the calculus and more details on some of the
above topics can be found in [9].

7.1 Related Work 

Traditionally, the refinement calculus has been used to develop imperative pro- 
grams from specifications [1, 21, 22, 20]. The increase in expressive power of logic 
programming languages, when compared with imperative languages, leads to a 
reduced conceptual gap between a problem and its solution, which means that 
fewer development steps are required during refinement. An additional advan- 
tage of logic programming languages over procedural languages is their simpler, 
cleaner semantics, which leads to simpler proofs of the refinement steps. Finally, 
the higher expressive level of logic programming languages means that the indi- 
vidual refinement steps typically achieve more. 

There have been several proposals for the constructive development of logic 
programs, for example in Jacquet [17]. Much of this work has focused on program 
transformations or equivalence transformations from a first-order logic specifi- 
cation [4, 16]. Read and Kazmierczak [23] propose a stepwise development of 
modular logic programs from first-order specifications, based on three refine- 
ment steps that are much coarser than the refinement steps proposed in this 
paper. This leaves most of the work to be done in discharging the proof obliga- 
tions for the refinement steps, for which they provide little guidance. Another 
approach to constructing logic programs is through schemata [19]. A logic pro- 
gram is designed through the application of common algorithmic structures. The 
designer chooses which program structure is most suitable to a task based on 
the data types in question. As such, the focus of this method is to aid the design 
of large programs. The refinement steps and corresponding verification proofs 
are therefore much larger. 

Deductive logic program synthesis [3, 11] is probably the most similar to 
the refinement calculus approach. In deductive synthesis, a specification is suc- 
cessively transformed using synthesis laws proven in an underlying framework 
(typically first-order logic). As mentioned earlier, the main difference between 
most deductive synthesis approaches and logic program refinement is the in- 
clusion of assumptions in the wide-spectrum language, acting as preconditions. 
However, Lau and Ornaghi [18] have the concept of a conditional specification, 
which includes an input relation for a procedure (e.g., types, modes) with respect 
to which the synthesis of the procedure can take place. The refinement calcu- 
lus generalises this by allowing an assumption (input relation) for any arbitrary 
program fragment. Another aspect of deductive synthesis is that the deduction
rules are derived with the SLD computation rule in mind. Thus aspects of ter- 
mination, completeness etc., have to be dealt with during the synthesis process. 
The refinement approach leaves clause ordering and computational termination 
as part of the translation from wide-spectrum language to code. 

Deville [10] introduces a systematic program development method for Prolog 
that incorporates assumptions and types similar to ours. The main difference is 
that Deville’s approach to program development is mostly informal, whereas our 
approach is fully formal. A second distinction is that Deville’s approach concen- 
trates on the development of individual procedures. By using a wide-spectrum 
language, our approach blurs the distinction between a logic description and a 
logic program. For example, general predicates may appear anywhere within a 
program, and the refinement rules allow them to be transformed within that con- 
text. Similarly, programming language constructs may be used and transformed 
at any point. 
Abstract. Most logic programming languages actually provide some
kind of dynamic scheduling to increase the expressive power and to con- 
trol execution. Input consuming derivations have been introduced to de- 
scribe dynamic scheduling while abstracting from the technical details. 
In this paper we review and compare the different proposals given in [9], 
[10] and [12] for denotational semantics of programs with input consum- 
ing derivations. We also show how they can be applied to termination 
analysis. 



1 Introduction 

1.1 Dynamic Scheduling in Logic Programming 

In logic programming the selection rule determines which atom in a query is 
selected at each derivation step. The standard selection rule is the left-to-right 
one of Prolog, which is simple to implement, but which can cause problems both 
with termination and with negation when selected atoms are not fully instanti- 
ated. Moreover there are situations - like in the context of parallel executions 
or generate-and-test patterns - that require a more flexible control mechanism 
(dynamic scheduling) in which the atom to be selected is determined at runtime.

Dynamic scheduling is achieved by using a dynamic selection rule; this
increases the expressive power of the language and allows for a finer control of
the execution. In practical systems, dynamic selection rules are implemented by
means of constructs such as delay declarations (as in Gödel [26] and ECLiPSe
[27]) or block declarations (as in SICStus Prolog [28] - block declarations are
actually a special kind of delay declaration). Alternatively, in concurrent logic
languages such as GHC [43], programs are augmented with guards controlling
the selection of atoms dynamically. For example, Moded Flat GHC [45] uses
conditions based on modes and instantiation constraints imposed on individual
clauses.

Delay declarations, advocated by van Emden and de Lucena [46], were introduced
explicitly in logic programming by Naish [37,34]. By associating conditions
to predicate symbols, delay declarations indicate when an atom can be selected
for resolution. Such conditions are based on instantiation: typical delay declarations
are ground(X) or nonvar(X), which specify that the associated atom can
be selected for evaluation only if its argument X is, respectively, a ground term or
a non-variable term. Delay declarations can also be combined by means
of logical operators, allowing for more complex control.

To see how delay declarations can enforce dynamic scheduling, consider the
following programs APPEND and IN_ORDER:

% append(Xs,Ys,Zs) ← Zs is the concatenation of the lists Xs and Ys
append([H|Xs],Ys,[H|Zs]) ← append(Xs,Ys,Zs).
append([],Ys,Ys).

% in_order(Tree,List) ← List is an ordered list of the nodes of Tree
in_order(tree(Label,Left,Right),Xs) ←
    in_order(Left,Ls),
    in_order(Right,Rs),
    append(Ls,[Label|Rs],Xs).
in_order(void,[]).

together with the query

Q : read_tree(Tree), in_order(Tree,List), write_list(List).

where read_tree and write_list are defined elsewhere. If read_tree cannot
read the whole tree at once - say, it receives the input from a stream - it would
be nice to be able to run the “processes” in_order and write_list on the
available input. This can be done properly only if one uses a dynamic selection
rule. Prolog’s rule would call in_order only after read_tree has finished, while
other fixed rules would immediately diverge. For instance, the fixed rule that
always selects the second atom in a clause body, and that selects the first one
only when the body contains only one atom, can lead to nontermination, as
the query in_order(Tree,List) can easily diverge. The same applies to the
rule that always selects the rightmost atom in a query, with the extra problem
that write_list(List) would be called with a non-instantiated argument: if
write_list is non-backtrackable (as many IO predicates are) this would imply
that this selection rule yields a wrong output. In the above program, in order
to avoid nontermination one can declare that predicates in_order, append and
write_list can be selected only if their first argument is not just a variable.
Formally,

delay in_order(T,_) until nonvar(T). 
delay append(Ls,_,_) until nonvar(Ls). 
delay write_list(Ls) until nonvar(Ls). 

These declarations prevent in_order, append and write_list from being 
selected “too early”, i.e., when their arguments are not “sufficiently instantiated”. 
Note that instead of having interleaving “processes”, one can also select several 
atoms in parallel, as long as the delay declarations are respected. This approach 
to parallelism was first proposed by Naish [36] and - as observed by Apt 
and Luitjes [5] - “has an important advantage over the ones proposed in the 
literature in that it allows us to parallelize programs written in a large subset of 
Prolog by merely adding to them delay declarations, so without modifying the 
original program”. 

Compared to other mechanisms for user-defined control, e.g., using the cut 
operator in connection with built-in predicates that test for the instantiation of a 
variable (var or ground), delay declarations are more compatible with the declar- 
ative character of logic programming. Nevertheless, many important declarative 
properties that have been proven for logic programs do not apply to programs 
with delay declarations. This is mainly due to the fact that delay declarations 
can cause deadlock situations, in which no atom in the query respects its delay 
declaration and therefore no atom is selectable. Because of this the well-known 
equivalence between model-theoretic and operational semantics does not hold. 
As an example, consider the query append(X,Y,Z) with the execution 
mechanism described above: it does not succeed (it deadlocks), in contrast 
with the fact that (infinitely many) instances of append(X,Y,Z) are contained 
in the least Herbrand model of APPEND. 
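
The effect of the three delay declarations above can be illustrated with a small 
Python sketch. It is only a toy model written for this illustration, not a Prolog 
system: atoms are tuples, variables are capitalized strings, and the names 
DELAYS, selectable and deadlocked are assumptions introduced here. 

```python
# Toy model of delay declarations (illustrative only; not a Prolog implementation).
# An atom is a tuple (predicate, arg1, ..., argN). A variable is a string that
# starts with an uppercase letter; every other value counts as non-variable.

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

# Predicates that may be selected only if their first argument is not a variable,
# mirroring declarations of the form "delay p(X, ...) until nonvar(X)".
DELAYS = {"in_order", "append", "write_list"}

def selectable(atom):
    """An atom respects its delay declaration if it has none, or if its
    first argument is a non-variable term."""
    pred, *args = atom
    return pred not in DELAYS or not is_var(args[0])

def deadlocked(query):
    """A non-empty query deadlocks when none of its atoms is selectable."""
    return bool(query) and not any(selectable(a) for a in query)

# Before read_tree has produced anything, only read_tree can be selected.
query = [("read_tree", "Tree"), ("in_order", "Tree", "List"), ("write_list", "List")]
print([selectable(a) for a in query])            # [True, False, False]

# The query append(X,Y,Z) cannot be selected at all: it deadlocks.
print(deadlocked([("append", "X", "Y", "Z")]))    # True
```

As soon as Tree is bound to a term such as tree(Label,Left,Right), the atom 
in_order(Tree,List) becomes selectable in this model, which is the interleaving 
behaviour the declarations are meant to allow. 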

1.2 Semantics of Logic Programs with Dynamic Scheduling 

By introducing dynamic scheduling we obtain more powerful and flexible pro- 
grams but we are faced with the problem of finding new techniques for ensuring 
correctness and termination of such programs and more generally for analyz- 
ing them. The standard semantics and properties are no longer valid when an 
atom can be delayed under some condition. In particular the standard semantics 
cannot capture the possibility of floundering when no atom in the goal can be 
selected. Hence it is not surprising that relatively few semantics have been 
proposed for logic programs with dynamic scheduling, despite their 
practical importance. 

The first proposal of an operational semantics for dynamic scheduling in the 
form of coroutining was given by Naish [35]. He defined SLDF resolution, which 
is a straightforward generalization of SLD resolution, where execution of atoms 
may be suspended indefinitely. He also considered termination of such programs 
and observed that if the set of callable atoms is closed under instantiation, the 
termination behaviour is more amenable to analysis. Moreover Naish stressed 
the importance of mode information for reasoning about termination of such 
programs. An operational semantics for constraint logic programs (CLP) with 
dynamic scheduling has also been given by Debray et al. [19]. 

Falaschi et al. [24,33,23] have defined a denotational semantics for CLP pro- 
grams with dynamic scheduling where the semantics of a query is given by a set 
of closure operators (each operator corresponds to a sequence of rule choices). 
They start from an operational semantics for constraint logic programs with dy- 
namic scheduling given in terms of derivations from the goals, which is similar 
to the one in [19] and in [32]. Then they give a semantics in terms of and-trees, 
which captures the structure of a derivation in a compositional way. An and-tree 
can be seen as a function mapping an initial constraint to its answer. The 
denotation of a sequence of atoms is then a set of closure operators, corresponding to 
the and-trees which have this sequence as root. Their denotational semantics is 
the analogue of the bottom-up S-semantics [13] for usual logic programs, where 
atoms are mapped to their set of answers. 

Such a denotational semantics can be used as a basis for the analysis of logic 
programs with dynamic scheduling, since closure operators can be abstracted by 
descriptions that capture their behaviour. This idea was followed by Marriott 
et al. in [32] where a framework for global dataflow analysis for logic program- 
ming languages with dynamic scheduling is developed. Its main use is to give 
information on calling patterns. In [17] the analysis is further improved both in 
precision and in efficiency. Optimization techniques for logic programs with 
dynamic scheduling, such as those in [38], have also been derived from these proposals. 

A very elegant definition of an algebraic and logical semantics for constraint 
logic languages with dynamic scheduling has been given by Marriott in [31]. It 
corresponds to an operational semantics based on the one given by Naish in [35] 
generalized to arbitrary constraints. Delayed atoms are considered as constraints 
and then the soundness and completeness results for success and finite failure 
for CLP are extended to CLP with dynamic scheduling. 

In spite of these proposals some problems remained open. Dynamic scheduling 
is often introduced to ensure the termination of a program by preventing 
possibly diverging derivations. Nevertheless, while for pure Prolog programs (i.e., 
logic programs employing the fixed leftmost selection rule) there exist results 
characterizing when a program terminates, such as in [7,18,14], no such 
characterization was derived for programs with dynamic scheduling from these 
semantics. 



1.3 Semantics of Input Consuming Derivations 

In order to provide a characterization of dynamic scheduling that is reasonably 
abstract and amenable to termination analysis, Smaus introduced in [40] input 
consuming derivations. The definition of input consuming program relies on the 
concept of mode. A moded program is a program in which each atom’s argu- 
ments are partitioned into input and output ones. Output arguments are those 
produced by the atom during the computation process, while input arguments 
are consumed. Roughly speaking, in an input consuming program only atoms 
whose input arguments are not instantiated through the unification step are 
allowed to be selected. 

We believe that - in many cases - the adoption of “natural” delay declarations 
is equivalent to considering only input consuming derivations [11]. This is 
the case, for instance, for the programs mentioned in the example above together 
with their natural mode, where the first position of in_order is considered as 
input, while the second one is output. In fact, under normal circumstances, 
the adoption of the stated delay declarations enforces nothing but a restriction 
to input consuming derivations. Moreover, other control mechanisms, such 
as the one in Moded Flat GHC, are also similar to requiring input consuming 
derivations: the resolution of an atom with a definition must not instantiate the input 
arguments of the resolved atom. 

Input consuming programs allow for simpler definitions of denotational 
semantics and have nice properties regarding termination. Hence they seem 
to be a reasonable and safe approximation to programs with general dynamic 
scheduling. In this paper we review and compare the different proposals given 
for denotational semantics of programs with input consuming derivations. We 
also show how they can be applied to termination analysis. Our review is based 
on [9], [10] and [12]. 

1.4 Structure of the Paper 

The paper is organized as follows. Section 2 contains some preliminary notations 
and definitions including input consuming programs. Section 3 introduces a first 
denotational semantics capturing computed answer substitutions of successful 
derivations. This semantics applies to well and nicely moded input consuming 
programs. In Section 4 a second denotational semantics for simply moded input 
consuming programs is presented, which is also able to model the intermediate 
results of partial derivations. Section 5 shows how these semantics have been used 
to characterize termination properties of input consuming programs. Section 6 
concludes the paper. 

2 Preliminaries 

The reader is assumed to be familiar with the terminology and the basic results 
of logic programs and their semantics [1,2,29]. In this section we introduce a few 
notions that will be used in the sequel. 

2.1 Terms and Substitutions 

Let $\mathcal{T}$ be the set of terms built on a finite set of data constructors $\mathcal{C}$ and a 
denumerable set of variable symbols $\mathcal{V}$. For any syntactic object $o$, we denote 
by $Var(o)$ the set of variables occurring in $o$. A syntactic object is linear if 
every variable occurs in it at most once. A substitution $\theta$ is a mapping from $\mathcal{V}$ 
to $\mathcal{T}$. Given a substitution $\sigma = \{x_1/t_1, \ldots, x_n/t_n\}$, we say that $\{x_1, \ldots, x_n\}$ is 
its domain (denoted by $Dom(\sigma)$), and $Var(\{t_1, \ldots, t_n\})$ is its range (denoted 
by $Ran(\sigma)$). Note that $Var(\sigma) = Dom(\sigma) \cup Ran(\sigma)$. We denote by $\epsilon$ the empty 
substitution: $Dom(\epsilon) = Ran(\epsilon) = \emptyset$. The result of the application of a 
substitution $\theta$ to a term $t$ is called an instance of $t$ and is denoted by $t\theta$. Given a 
substitution $\sigma$ and a syntactic object $E$, we denote by $\sigma|_E$ the restriction of $\sigma$ to 
the variables in $Var(E)$, i.e., $\sigma|_E(x) = \sigma(x)$ if $x \in Var(E)$, otherwise $\sigma|_E(x) = x$. 
If $t_1, \ldots, t_n$ is a permutation of $x_1, \ldots, x_n$ then we say that $\sigma$ is a renaming. The 
composition of substitutions is denoted by juxtaposition, i.e., $x\theta\sigma = (x\theta)\sigma$. We say that 
$t$ is a variant of $t'$, written $t \approx t'$, if $t$ and $t'$ are instances of each other. In this 
case there exists a renaming $\theta$ such that $t' = t\theta$. A substitution $\theta$ is a unifier 
of terms $t$ and $t'$ if $t\theta = t'\theta$. We denote by $mgu(t,t')$ any most general unifier 
(mgu, in short) of $t$ and $t'$. 
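
The notions of substitution application and most general unifier can be made 
concrete with a short Python sketch. The representation is an assumption chosen 
for this illustration only: variables are capitalized strings, constants are other 
strings, and a compound term f(t1,...,tn) is the tuple ("f", t1, ..., tn). The 
function names apply, occurs and mgu are likewise introduced here. 

```python
# Minimal unification sketch (illustrative; representation and names are assumptions).
# A substitution is a dict from variable names to terms, kept in "triangular" form:
# a bound term may itself contain bound variables, which apply() resolves.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply(t, subst):
    """Return the instance of t under subst, following bindings until none applies."""
    if is_var(t):
        return apply(subst[t], subst) if t in subst else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, subst) for a in t[1:])
    return t

def occurs(v, t):
    """True if variable v occurs in the (already instantiated) term t."""
    if is_var(t):
        return v == t
    return isinstance(t, tuple) and any(occurs(v, a) for a in t[1:])

def mgu(s, t, subst=None):
    """Return a most general unifier of s and t as a dict, or None if none exists."""
    subst = {} if subst is None else subst
    s, t = apply(s, subst), apply(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return None if occurs(s, t) else {**subst, s: t}
    if is_var(t):
        return mgu(t, s, subst)
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            subst = mgu(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

# f(X, b) and f(a, Y) unify with {X/a, Y/b}.
print(mgu(("f", "X", "b"), ("f", "a", "Y")))     # {'X': 'a', 'Y': 'b'}
```

The occurs check makes the sketch return None for pairs such as X and f(X), 
which have no unifier over finite terms. 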



2.2 Programs and Derivations 

Let $\mathcal{P}$ be a finite set of predicate symbols. An atom is an object of the form 
$p(t_1, \ldots, t_n)$ where $p \in \mathcal{P}$ is an $n$-ary predicate symbol and $t_1, \ldots, t_n \in \mathcal{T}$. Given 
an atom $A$, we denote by $Rel(A)$ the predicate symbol of $A$. A query is a finite, 
possibly empty, sequence of atoms $A_1, \ldots, A_m$. The empty query is denoted by 
$\Box$. Following the convention adopted in [2], we use bold characters to denote 
sequences of objects: so, for instance, $\mathbf{t}$ denotes a sequence of terms, while $\mathbf{B}$ is a 
query (i.e., a possibly empty sequence of atoms). A (definite) clause is a formula 
$H \leftarrow \mathbf{B}$ where $H$ is an atom (the head) and $\mathbf{B}$ is a query (the body). When 
$\mathbf{B}$ is empty, $H \leftarrow \mathbf{B}$ is written $H \leftarrow$ and is called a unit clause. A (definite) 
program is a finite set of clauses. We denote atoms by $A, B, H, \ldots$, queries by 
$Q, \mathbf{A}, \mathbf{B}, \mathbf{C}, \ldots$, clauses by $c, d, \ldots$, and programs by $P$. 

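Computations are constructed as sequences of “basic” steps. Consider a 
non-empty query $\mathbf{A}, B, \mathbf{C}$ and a clause $c$. Let $H \leftarrow \mathbf{B}$ be a variant of $c$ variable 
disjoint from $\mathbf{A}, B, \mathbf{C}$ and assume that $B$ and $H$ unify with mgu $\theta$. The query 
$(\mathbf{A}, \mathbf{B}, \mathbf{C})\theta$ is called a resolvent of $\mathbf{A}, B, \mathbf{C}$ and $c$ with selected atom $B$ and mgu 
$\theta$. A derivation step is denoted by $\mathbf{A}, B, \mathbf{C} \stackrel{\theta}{\Longrightarrow}_{P,c} (\mathbf{A}, \mathbf{B}, \mathbf{C})\theta$. The clause $H \leftarrow \mathbf{B}$ 
is called its input clause. The atom $B$ is called the selected atom of $\mathbf{A}, B, \mathbf{C}$. 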

If $P$ is clear from the context or $c$ is irrelevant then we drop the reference to 
them. A derivation is obtained by iterating derivation steps. A maximal sequence 

$\delta : Q_0 \stackrel{\theta_1}{\Longrightarrow}_{P,c_1} Q_1 \stackrel{\theta_2}{\Longrightarrow}_{P,c_2} \cdots Q_n \stackrel{\theta_{n+1}}{\Longrightarrow}_{P,c_{n+1}} Q_{n+1} \cdots$ 

is called a derivation of $P \cup \{Q_0\}$ provided that for every step the 
standardization apart condition holds, i.e., the input clause employed is variable disjoint 
from the initial query $Q_0$ and from the substitutions and the input clauses used 
at earlier steps. 

Derivations can be finite or infinite. If $\delta : Q_0 \stackrel{\theta_1}{\Longrightarrow}_{P,c_1} \cdots \stackrel{\theta_n}{\Longrightarrow}_{P,c_n} Q_n$ is a 
finite prefix of a derivation, also denoted by $\delta : Q_0 \stackrel{\theta}{\longrightarrow} Q_n$ with $\theta = \theta_1 \cdots \theta_n$, we 
say that $\delta$ is a partial derivation and $\theta$ is a partial computed answer substitution 
of $P \cup \{Q_0\}$. If $\delta$ is maximal and ends with the empty query, then $\theta$ is called 
computed answer substitution (c.a.s., for short). In this case we say that the 
derivation is successful. The length of a (partial) derivation $\delta$, denoted by $len(\delta)$, 
is the number of derivation steps in $\delta$. 



2.3 Modes and Input Consuming Programs 

Modes are a common tool for verification. A mode is a function that labels as 
input or output the positions of each predicate in order to indicate how the 
arguments of such a predicate should be used. 
Definition 1 (Mode). A mode for a predicate symbol $p$ of arity $n$ is a function 
$m_p$ from $\{1, \ldots, n\}$ to $\{I, O\}$. 

We call moded atom (clause, program, query), any atom (clause, program, 
query) which has a mode associated to its predicate symbols. 

If $m_p(i) = I$ (resp. $O$), we say that $i$ is an input (resp. output) position of 
$p$ (with respect to $m_p$). In the examples, we often indicate the mode by writing 
the atom $p(m_p(1), \ldots, m_p(n))$, e.g., append(I, I, O). 

We assume that each predicate symbol has a unique mode associated to it; 
multiple modes may be obtained by simply renaming the predicates. We denote 
by $In(Q)$ (resp. $Out(Q)$) the sequence of terms filling in the input (resp. output) 
positions of predicates in $Q$. Moreover, when writing an atom as $p(\mathbf{s}, \mathbf{t})$, we are 
indicating that $\mathbf{s}$ is the sequence of terms filling in its input positions and $\mathbf{t}$ is 
the sequence of terms filling in its output positions. 

The notion of input consuming derivation was introduced in [40] as a formal- 
ism for describing dynamic scheduling in an abstract way. 

Definition 2 (Input Consuming Derivation). 

- A derivation step $\mathbf{A}, B, \mathbf{C} \stackrel{\theta}{\Longrightarrow} (\mathbf{A}, \mathbf{B}, \mathbf{C})\theta$ is input consuming if 
  $In(B)\theta = In(B)$. 

- A derivation is input consuming if all its derivation steps are input consuming. 

In the following we sometimes write ic-derivation for input consuming 
derivation, and we call input consuming program (ic-program) a program that is 
considered with respect to input consuming derivations only. 

Example 3. Consider the program REVERSE with accumulator and the following 
modes: reverse(I, O) and reverse_acc(I, O, I). 

reverse(Xs,Ys) ← reverse_acc(Xs,Ys,[]). 
reverse_acc([],Ys,Ys). 
reverse_acc([X|Xs],Ys,Zs) ← reverse_acc(Xs,Ys,[X|Zs]). 

The following derivation $\delta$ of REVERSE $\cup$ {reverse([X1,X2],Zs)} is input 
consuming. 

$\delta$ : reverse([X1,X2],Zs) $\Rightarrow$ reverse_acc([X1,X2],Zs,[]) 
    $\Rightarrow$ reverse_acc([X2],Zs,[X1]) $\Rightarrow$ reverse_acc([],Zs,[X2,X1]) $\Rightarrow$ $\Box$. 

Allowing only input consuming derivations is a form of dynamic scheduling, 
since whether or not an atom can be selected depends on its degree of instantia- 
tion at runtime. Given a non-empty query, if no atom is resolvable via an input 
consuming derivation step and no failure arises, then we say that the query 
deadlocks. Therefore, an ic-derivation can either be successful or finitely failing 
or infinite or deadlock. Each ic-derivation which is not a deadlock is also an SLD 
derivation. 
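
The condition $In(B)\theta = In(B)$ of Definition 2 can be checked mechanically. The 
following Python sketch does so for a single derivation step. It is an illustration 
under stated assumptions, not the formalization used in the cited papers: terms 
are tuples, variables are capitalized strings, the list [H|T] is written (".", H, T), 
the empty list is "nil", a mode is a string such as "IOI", and the mgu is 
computed so that clause variables are bound in preference to query variables. 

```python
# Sketch of the input consuming test In(B)θ = In(B) (representation is an assumption).

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply(t, th):
    if is_var(t):
        return apply(th[t], th) if t in th else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, th) for a in t[1:])
    return t

def occurs(v, t):
    if is_var(t):
        return v == t
    return isinstance(t, tuple) and any(occurs(v, a) for a in t[1:])

def mgu(s, t, th):
    s, t = apply(s, th), apply(t, th)
    if s == t:
        return th
    if is_var(s):
        return None if occurs(s, t) else {**th, s: t}
    if is_var(t):
        return mgu(t, s, th)
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            th = mgu(a, b, th)
            if th is None:
                return None
        return th
    return None

def input_consuming(atom, head, mode):
    """True if resolving `atom` with a clause whose head (already renamed apart)
    is `head` is an input consuming step under `mode`."""
    th = mgu(head, atom, {})   # head first, so clause variables get bound first
    if th is None:
        return False           # the atom and the head do not unify: no step at all
    inputs = [a for a, m in zip(atom[1:], mode) if m == "I"]
    return all(apply(a, th) == a for a in inputs)

# Head of the recursive clause of reverse_acc, renamed apart from the query.
head = ("reverse_acc", (".", "X_1", "Xs_1"), "Ys_1", "Zs_1")
b1 = ("reverse_acc", (".", "X1", (".", "X2", "nil")), "Zs", "nil")
b2 = ("reverse_acc", "Ws", "Zs", "nil")
print(input_consuming(b1, head, "IOI"))   # True
print(input_consuming(b2, head, "IOI"))   # False
```

The second call returns False because unification would have to bind the input 
variable Ws to a list, so under input consuming derivations the atom stays 
suspended until some other step instantiates Ws. 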
2.4 Classes of Moded Programs 

In the sequel we are going to refer to classes of programs that in some way 
behave well with respect to the given mode. In particular, we are going to use 
the concepts of well moded program (Dembinski and Maluszynski [20]), of nicely 
moded program (Chadha and Plaisted [15]) and of simply moded program (Apt 
and Etalle [4]). 

Definition 4 (Well, Nicely and Simply Moded Program). 

- Well Moded. A clause $p(\mathbf{t}_0, \mathbf{s}_{n+1}) \leftarrow p_1(\mathbf{s}_1, \mathbf{t}_1), \ldots, p_n(\mathbf{s}_n, \mathbf{t}_n)$ is well 
  moded if for all $i \in [1, n+1]$ 

  $Var(\mathbf{s}_i) \subseteq \bigcup_{j=0}^{i-1} Var(\mathbf{t}_j)$. 

  If we call producing positions the input positions of the head and the output 
  positions of the body and consuming positions the other ones, then we can 
  intuitively say that a clause is well moded if every variable in a consuming 
  position occurs also in an earlier (w.r.t. the indices, which have been 
  deliberately chosen in this way) producing position. 

- Nicely Moded. A clause $p(\mathbf{t}_0, \mathbf{s}_{n+1}) \leftarrow p_1(\mathbf{s}_1, \mathbf{t}_1), \ldots, p_n(\mathbf{s}_n, \mathbf{t}_n)$ is nicely 
  moded if $\mathbf{t}_1, \ldots, \mathbf{t}_n$ is a linear sequence of terms, $Var(\mathbf{t}_0) \cap Var(\mathbf{t}_1, \ldots, \mathbf{t}_n) = \emptyset$, 
  and for all $i \in [1, n]$ 

  $Var(\mathbf{s}_i) \cap \bigcup_{j=i}^{n} Var(\mathbf{t}_j) = \emptyset$. 

  Intuitively a clause is nicely moded if there are no conflicts among producing 
  positions (a variable may appear in at most one producing position with one 
  exception: a variable may appear twice in a producing position of the head), 
  and a variable may not be consumed before it is produced. 

- Simply Moded. A clause $p(\mathbf{t}_0, \mathbf{s}_{n+1}) \leftarrow p_1(\mathbf{s}_1, \mathbf{t}_1), \ldots, p_n(\mathbf{s}_n, \mathbf{t}_n)$ is simply 
  moded if it is nicely moded and $\mathbf{t}_1, \ldots, \mathbf{t}_n$ is a linear sequence of variables. 

- A query $Q$ is well (resp. nicely, simply) moded if the clause $q \leftarrow Q$ is well 
  (resp. nicely, simply) moded, where $q$ is a variable-free atom. 

  Note that an atomic query $p(\mathbf{s}, \mathbf{t})$ is well moded if $\mathbf{s}$ is a sequence of ground 
  terms and it is nicely moded if $\mathbf{t}$ is linear and $Var(\mathbf{s}) \cap Var(\mathbf{t}) = \emptyset$. 

- A program is well (resp. nicely, simply) moded if all of its clauses are well 
  (resp. nicely, simply) moded. 

Hence the class of simply moded programs is a subclass of nicely moded ones 
and it includes both some well moded and some non-well moded programs. 

In [42] permutation well (nicely) moded programs and queries are also de- 
fined, i.e., programs and queries which would be well (nicely) moded after a 
permutation of the atoms respectively in the bodies and in the queries. 
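
The well-modedness condition of Definition 4 amounts to a left-to-right scan of 
the clause, which the following Python sketch makes explicit. The representation 
is an assumption made for this illustration: an atom is a tuple whose first 
component is the predicate name, variables are capitalized strings, the list 
[H|T] is (".", H, T), and modes maps each predicate name to a string over I 
and O. Only well-modedness is checked; nicely and simply moded clauses would 
need additional linearity and disjointness tests. 

```python
# Sketch of a well-modedness test for one clause (representation is an assumption).

def variables(t):
    """Set of variables occurring in a term or atom."""
    if isinstance(t, str):
        return {t} if t[:1].isupper() else set()
    return set().union(*(variables(a) for a in t[1:]))

def vars_of(terms):
    return set().union(*(variables(t) for t in terms))

def split(atom, modes):
    """Return (input terms, output terms) of an atom under its predicate's mode."""
    mode = modes[atom[0]]
    ins = [a for a, m in zip(atom[1:], mode) if m == "I"]
    outs = [a for a, m in zip(atom[1:], mode) if m == "O"]
    return ins, outs

def well_moded(head, body, modes):
    """Check Var(s_i) ⊆ Var(t_0) ∪ ... ∪ Var(t_{i-1}) for every i in [1, n+1]."""
    head_in, head_out = split(head, modes)   # t_0 and s_{n+1}
    produced = vars_of(head_in)
    for atom in body:
        ins, outs = split(atom, modes)       # s_i and t_i
        if not vars_of(ins) <= produced:
            return False
        produced |= vars_of(outs)
    return vars_of(head_out) <= produced

rec_head = ("append", (".", "H", "Xs"), "Ys", (".", "H", "Zs"))
rec_body = [("append", "Xs", "Ys", "Zs")]
print(well_moded(rec_head, rec_body, {"append": "IIO"}))   # True
print(well_moded(rec_head, rec_body, {"append": "OIO"}))   # False
```

With the mode append(O, I, O) the check fails because the variable H of the 
head's output positions is never produced by an earlier position. 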
Example 5. 

- The program APPEND of the introduction in the mode append(I, I, O) is well, 
  nicely and simply moded. 

- REVERSE with accumulator of Example 3 is well and simply moded. 

- Furthermore, consider the following program PALINDROME 

  palindrome(Xs) ← reverse(Xs,Xs). 

  in the mode palindrome(I), together with the program REVERSE with the 
  modes reverse(I, O). This program is well moded but not nicely moded 
  (since Xs occurs both in an input and in an output position of the same 
  body atom). However, since the program REVERSE is used here for checking 
  whether a list is a palindrome, its natural modes are reverse(I, I) and 
  reverse_acc(I, I, I). With these modes, the program PALINDROME is both 
  well moded and simply moded. 

Most programs are simply moded (see the mini-survey at the end of [4]) and 
often programs that are not simply moded can naturally be transformed into 
simply moded ones (see [10]). 

The above notions of well, nicely and simply moded are “persistent” with 
respect to input consuming derivations. The following lemma is a straightforward 
extension of [5, Lemma 30]. 

Lemma 6. In an input consuming derivation, every resolvent of a well (resp. 
nicely, simply) moded query and a well (resp. nicely, simply) moded clause is 
well (resp. nicely, simply) moded. 

Notice that in the case of nicely and simply moded programs the above 
lemma depends on the fact that only input consuming derivations are considered. 
Indeed, when “normal” SLD derivations are considered persistence holds only 
when the leftmost selection rule is used. Otherwise, speculative bindings might 
destroy the property of being nicely moded. 

On the other hand, for well moded programs, any SLD resolvent of a well 
moded query with a well moded clause is well moded ([2]). 

Finally, it is worth recalling that, when considering nicely (respectively 
simply) moded, input consuming programs, half of the famous switching lemma 
still applies. The following Left-Switching Lemma has been proven in [10]. 

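Lemma 7. (Left-Switching) Let the program $P$ and the query $Q_0$ be nicely 
moded. Let $\delta$ be a (partial) input consuming derivation of $P \cup \{Q_0\}$ of the form 

$\delta : Q_0 \stackrel{\theta_1}{\Longrightarrow}_{c_1} Q_1 \cdots Q_n \stackrel{\theta_{n+1}}{\Longrightarrow}_{c_{n+1}} Q_{n+1} \stackrel{\theta_{n+2}}{\Longrightarrow}_{c_{n+2}} Q_{n+2}$ 

where 

- $Q_n$ is a query of the form $\mathbf{A}, A, \mathbf{B}, B, \mathbf{C}$, 

- $Q_{n+1}$ is a resolvent of $Q_n$ and $c_{n+1}$ w.r.t. $B$, 

- $Q_{n+2}$ is a resolvent of $Q_{n+1}$ and $c_{n+2}$ w.r.t. $A\theta_{n+1}$. 

Then, there exist $Q'_{n+1}$, $\theta'_{n+1}$, $\theta'_{n+2}$ and a derivation $\delta'$ such that 

$\theta_{n+1}\theta_{n+2} = \theta'_{n+1}\theta'_{n+2}$ 

and 

$\delta' : Q_0 \stackrel{\theta_1}{\Longrightarrow}_{c_1} Q_1 \cdots Q_n \stackrel{\theta'_{n+1}}{\Longrightarrow}_{c_{n+2}} Q'_{n+1} \stackrel{\theta'_{n+2}}{\Longrightarrow}_{c_{n+1}} Q_{n+2}$ 

where $\delta'$ is input consuming and 

- $\delta$ and $\delta'$ coincide up to the resolvent $Q_n$, 

- $Q'_{n+1}$ is a resolvent of $Q_n$ and $c_{n+2}$ w.r.t. $A$, 

- $Q_{n+2}$ is a resolvent of $Q'_{n+1}$ and $c_{n+1}$ w.r.t. $B\theta'_{n+1}$, 

- $\delta$ and $\delta'$ coincide after the resolvent $Q_{n+2}$. 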



2.5 The S-semantics 

The aim of the S-semantics approach (see [13]) is modeling the observable 
behaviors for a variety of logic programming languages. The observable we consider 
here is the computed answer substitutions. The semantics is defined as follows: 

$S(P) = \{\, p(x_1, \ldots, x_n)\theta \mid x_1, \ldots, x_n \text{ are distinct variables and } 
p(x_1, \ldots, x_n) \stackrel{\theta}{\longrightarrow}_P \Box \text{ is an SLD derivation} \,\}$. 

This semantics enjoys all the valuable properties of the least Herbrand model, 
as summarized below. To present the main results on the S-semantics 
we need to introduce two further concepts. Let $P$ be a program, and 
$I$ be a set of atoms closed under variance. 

- The immediate consequence operator for the S-semantics is defined as: 

  $T^S_P(I) = \{\, H\theta \mid \exists\, H \leftarrow \mathbf{B} \text{ variant of a clause of } P,\ 
  \exists\, \mathbf{C} \in I \text{ renamed apart (*) w.r.t. } H, \mathbf{B},\ 
  \theta = mgu(\mathbf{B}, \mathbf{C}) \,\}$. 

- $I$ is called an S-model of $P$ if $T^S_P(I) \subseteq I$. 

Falaschi et al. [25] showed that $T^S_P$ is continuous on the lattice of term 
interpretations, that is, sets of possibly non-ground atoms, with the subset ordering. Powers 
of the operator $T^S_P$ are defined in the standard way as follows: $T^S_P \uparrow 0(I) = I$, 
$T^S_P \uparrow (i+1)(I) = T^S_P(T^S_P \uparrow i(I))$, and $T^S_P \uparrow \omega(I) = \bigcup_{i=0}^{\infty} T^S_P \uparrow i(I)$. We 
abbreviate $T^S_P \uparrow \omega(\emptyset)$ to $T^S_P \uparrow \omega$. In [25] they proved the following: 

- $S(P)$ = least S-model of $P$ = $T^S_P \uparrow \omega$. 

(*) Here and in the sequel, when we write “$\mathbf{C} \in I$, renamed apart w.r.t. some expression 
$e$”, we naturally mean that $I$ contains the atoms $C'_1, \ldots, C'_n$, and that $\mathbf{C}$ is a renaming 
of $C'_1, \ldots, C'_n$ such that $\mathbf{C}$ shares no variable with $e$ and that two distinct atoms of 
$\mathbf{C}$ share no variables with each other. 
Therefore, the S-semantics enjoys a declarative interpretation and a bottom-up 
construction, just like the Herbrand one. In addition, the S-semantics 
reflects the observable behavior in terms of computed answer substitutions, 
as shown by the following well-known result. 

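Theorem 8 ([25]). Let $P$ be a program, $\mathbf{A}$ be a query. The following statements 
are equivalent: 

- there exists an SLD derivation $\mathbf{A} \stackrel{\vartheta}{\longrightarrow}_P \Box$, 

- there exists $\mathbf{A}' \in S(P)$ (renamed apart w.r.t. $\mathbf{A}$), such that $\sigma = mgu(\mathbf{A}, \mathbf{A}')$, 

where $\mathbf{A}\sigma \approx \mathbf{A}\vartheta$. 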

Example 9. Let us see this semantics applied to the programs APPEND and 
REVERSE so far encountered. 

- S(APPEND) = { append([],X,X), 
                append([X1],X,[X1|X]), 
                append([X1,X2],X,[X1,X2|X]), ... }. 

- S(REVERSE) = { reverse([],[]), 
                 reverse([X1],[X1]), 
                 reverse([X1,X2],[X2,X1]), 
                 reverse_acc([],X,X), 
                 reverse_acc([X1],X,[X1|X]), 
                 reverse_acc([X1,X2],X,[X2,X1|X]), ... }. 
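
The bottom-up construction of S(APPEND) can be reproduced for a few iterations 
with the Python sketch below. It is a toy reading of the immediate consequence 
operator written for this illustration, under the following assumptions: terms are 
tuples, variables are capitalized strings, the list [H|T] is (".", H, T), the empty 
list is "nil", and variants are identified by renaming variables to V1, V2, ... in 
order of first occurrence. The names tp, rename and canonical are introduced 
here and do not come from the cited papers. 

```python
# Bottom-up sketch of the S-semantics of APPEND (representation is an assumption).
from itertools import product

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply(t, th):
    if is_var(t):
        return apply(th[t], th) if t in th else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, th) for a in t[1:])
    return t

def occurs(v, t):
    if is_var(t):
        return v == t
    return isinstance(t, tuple) and any(occurs(v, a) for a in t[1:])

def mgu(s, t, th):
    s, t = apply(s, th), apply(t, th)
    if s == t:
        return th
    if is_var(s):
        return None if occurs(s, t) else {**th, s: t}
    if is_var(t):
        return mgu(t, s, th)
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            th = mgu(a, b, th)
            if th is None:
                return None
        return th
    return None

def rename(t, tag):
    """Rename apart: append a tag to every variable of t."""
    if is_var(t):
        return t + tag
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(a, tag) for a in t[1:])
    return t

def canonical(t, names=None):
    """Rename variables to V1, V2, ... so that variants coincide syntactically."""
    names = {} if names is None else names
    if is_var(t):
        return names.setdefault(t, "V%d" % (len(names) + 1))
    if isinstance(t, tuple):
        return (t[0],) + tuple(canonical(a, names) for a in t[1:])
    return t

def tp(program, interp):
    """One application of the immediate consequence operator to interp."""
    result = set()
    for head, body in program:
        for atoms in product(interp, repeat=len(body)):
            th = {}
            for k, (b, c) in enumerate(zip(body, atoms)):
                th = mgu(b, rename(c, "_%d" % k), th)
                if th is None:
                    break
            if th is not None:
                result.add(canonical(apply(head, th)))
    return result

APPEND = [
    (("append", (".", "H", "Xs"), "Ys", (".", "H", "Zs")), [("append", "Xs", "Ys", "Zs")]),
    (("append", "nil", "Ys", "Ys"), []),
]

interp = set()
for _ in range(3):
    interp = tp(APPEND, interp)
print(len(interp))   # 3
```

After three iterations the interpretation contains exactly the three atoms of 
S(APPEND) listed in Example 9, up to the names of the variables. 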

2.6 Semantics of Input Consuming Programs 

In Sections 3 and 4 we present two semantics for input consuming programs 
which are related to the S-semantics. To define such semantics, the observables we 
focus on are the computed answer substitutions. First, we consider a 
semantics given by the computed answer substitutions of successful derivations. This 
corresponds to the S-semantics of logic programming [13] when restricted to a 
particular set of queries. Given a program $P$ and a set of queries $\mathcal{C}$, this semantics 
can be defined formally as 

$\mathcal{O}^{ic}_{S}(P, \mathcal{C}) = \{\, \mathbf{A}\theta \mid \mathbf{A} \in \mathcal{C} \text{ and there exists an ic-derivation } \mathbf{A} \stackrel{\theta}{\longrightarrow}_P \Box \,\}$. 

While this semantics appears very natural, it can be unsuitable for modelling 
the reactive nature of input consuming programs. In fact, as we mentioned in 
the introduction, input consuming derivations can be used to model dynamic 
scheduling and parallelism, and in this context it is very important to model the 
results of partial computations. Indeed, the standard semantics for concurrent 
logic languages such as ccp [39,22] and GHC [44] often capture such intermediate 
results, or in any case, also the results of non-successful computations [16]. In 
fact, the (partial) result of a computation may trigger another computation by 
instantiating sufficiently the input positions of another atom so that it becomes 
resolvable. Because of this, when one wants to characterize for instance 
termination, the adoption of a semantics which is able to model intermediate results 
becomes essential, as shown in Section 5. Thus we also consider a semantics 
capturing the results of partial input consuming derivations. Given a program 
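$P$ and a set of queries $\mathcal{C}$, this semantics can be defined formally as 

$\mathcal{O}^{ic}_{P}(P, \mathcal{C}) = \{\, \mathbf{A}\theta \mid \mathbf{A} \in \mathcal{C} \text{ and there exists an ic-derivation } \mathbf{A} \stackrel{\theta}{\longrightarrow}_P \mathbf{B} \,\}$, 

where $\mathbf{B}$ is any query. 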



3 Semantics of Well Moded Input Consuming Programs 

To characterize our two semantics for ic-programs, we start from the simplest 
case: when one is interested only in the successful derivations. Then - if one does 
not restrict to ic-derivations - the observables (given by successful derivations) 
can be captured by the S-semantics of classical logic programs. 

In this section we show that the standard S-semantics is compositional and 
correct also for input consuming programs, provided that the programs are well 
and nicely moded and that only nicely moded queries are considered. The results 
reported in this section are proved in [9] . 

Proposition 10. Let $P$ be a well and nicely moded program, $A$ be a nicely moded 
atomic query. The following statements are equivalent: 

(i) there exists an input consuming derivation $A \stackrel{\vartheta}{\longrightarrow}_P \Box$, 

(ii) there exists $A' \in S(P)$ (renamed apart w.r.t. $A$), and $\sigma = mgu(A, A')$ such 
that $In(A)\sigma \approx In(A)$, 

where $A\sigma \approx A\vartheta$. 

To extend Proposition 10 to arbitrary (non-atomic) queries we need the fol- 
lowing definition. 

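Definition 11. Let $\mathbf{A} = p_1(\mathbf{s}_1, \mathbf{t}_1), \ldots, p_n(\mathbf{s}_n, \mathbf{t}_n)$ be a query. We define 

$VIn^*(\mathbf{A}) := \bigcup_{i=1}^{n} \{\, x \mid x \in Var(\mathbf{s}_i) \text{ and } x \notin \bigcup_{j=1}^{i-1} Var(\mathbf{t}_j) \,\}$. 

$VIn^*(\mathbf{A})$ denotes the set of variables occurring in an input position of an atom 
of $\mathbf{A}$ but not occurring in an output position of an earlier atom. Note that if $\mathbf{A}$ 
is well moded then $VIn^*(\mathbf{A}) = \emptyset$. 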
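
Definition 11 can be read as a single left-to-right pass over the query, as the 
following Python sketch shows. The representation is again an assumption made 
for illustration: atoms are tuples headed by the predicate name, variables are 
capitalized strings, the list [H|T] is (".", H, T), the empty list is "nil", and 
modes maps predicate names to strings over I and O. 

```python
# Sketch of VIn*(A) for a query given as a list of moded atoms
# (representation and names are assumptions).

def variables(t):
    """Set of variables occurring in a term."""
    if isinstance(t, str):
        return {t} if t[:1].isupper() else set()
    return set().union(*(variables(a) for a in t[1:]))

def vin_star(query, modes):
    """Variables that occur in an input position of some atom of the query
    but not in an output position of an earlier atom."""
    result, earlier_outputs = set(), set()
    for atom in query:
        mode = modes[atom[0]]
        for arg, m in zip(atom[1:], mode):
            if m == "I":
                result |= variables(arg) - earlier_outputs
        for arg, m in zip(atom[1:], mode):
            if m == "O":
                earlier_outputs |= variables(arg)
    return result

def plist(*items, tail="nil"):
    """Build the term for the list [items | tail]."""
    term = tail
    for item in reversed(items):
        term = (".", item, term)
    return term

modes = {"append": "IIO"}
# append([a,X,c],Ys,Zs), append(Zs,[b],Ls)
query = [("append", plist("a", "X", "c"), "Ys", "Zs"),
         ("append", "Zs", plist("b"), "Ls")]
print(sorted(vin_star(query, modes)))   # ['X', 'Ys']
```

The variable Zs is not included because it occurs in an output position of the 
first atom before it is consumed by the second one. 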

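Theorem 12. Let $P$ be a well and nicely moded program, $\mathbf{A}$ be a nicely moded 
query and $NM$ be the class of nicely moded queries. The following statements 
are equivalent: 

(i) there exists an input consuming derivation $\mathbf{A} \stackrel{\vartheta}{\longrightarrow}_P \Box$ (and $\mathbf{A}\vartheta \in NM$), 

(ii) there exists $\mathbf{A}' \in S(P)$ (renamed apart w.r.t. $\mathbf{A}$), and $\sigma = mgu(\mathbf{A}, \mathbf{A}')$ such 
that $\mathbf{A}\sigma|_{VIn^*(\mathbf{A})} \approx \mathbf{A}$, 

where $\mathbf{A}\sigma \approx \mathbf{A}\vartheta$. 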

The condition $\mathbf{A}\sigma|_{VIn^*(\mathbf{A})} \approx \mathbf{A}$ above says that the substitution $\sigma$ just renames 
the variables occurring in an input position of $\mathbf{A}$ but not occurring in an output 
position of an earlier atom. In case of an atomic query $\mathbf{A} := A$, we might 
substitute this condition with the somewhat more attractive condition $In(A)\sigma \approx 
In(A)$ of Proposition 10. 

Theorem 12 thus shows that $S(P)$ is compositional and correct for input 
consuming programs, provided that programs are well and nicely moded and 
that queries are nicely moded. In other words, given the restrictions on programs 
and queries, the S-semantics is correct with respect to the observables given by 
the computed answer substitutions of successful ic-derivations. 

Example 13. Consider the program APPEND of the introduction with the mode 
append(I, I, O). S(APPEND), reported in Example 9, allows us to draw a number 
of conclusions: 

- append([X,b],Y,Z) has an input consuming successful derivation. 

  In particular, it has an input consuming derivation with c.a.s. {Z/[X,b|Y]}. 
  This can be derived by just looking at S(APPEND), from the fact that A = 
  append([X1,X2],X3,[X1,X2|X3]) $\in S(P)$ and that append([X,b],Y,Z) is 
  - in its input positions - an instance of A. 

- append(Y,[X,b],Z) has no input consuming successful derivations. 

  This is because there is no A $\in S(P)$ such that append(Y,[X,b],Z) is an 
  instance of A in the input positions. 

- Observe that the query append(Y,[X,b],Z) has infinitely many successful 
  SLD derivations and no failures. Therefore it does not fail when we consider 
  ic-derivations either. Since, as noted above, the query has no input consuming 
  successful derivations, this implies that - in the presence of input consuming 
  derivations - append(Y,[X,b],Z) will eventually either deadlock or run into 
  an infinite derivation. 

The previous results hold also in case the programs are permutation well and 
nicely moded and queries are permutation nicely moded [42]. 

While in the context of SLD (not input consuming) derivations the S-semantics 
is also fully abstract, this is not so when considering input consuming programs. 
Consider the following two trivial programs: 

P1 = { c1: p(X). 
       c2: p(a). } 

P2 = { p(X). } 

In both programs the mode is p(I). These two programs, despite being different, 
yield exactly the same computed answer substitutions for all queries when 
ic-derivations are considered. In fact the extra clause c2 in P1 can resolve an atom 
A only if A contains the term a in its input position, but in this case c2 behaves 
exactly as c1 does (*). Nevertheless, $S(P1) = \{p(X), p(a)\} \neq \{p(X)\} = S(P2)$, 
demonstrating that the S-semantics is not fully abstract when considering 
ic-derivations. In the next section we present a more complex semantics, which is 
also fully abstract for ic-derivations. 



4 Semantics of Simply Moded Input Consuming 
Programs 

The semantics presented in the previous section applies only when we are in- 
terested in the computed answer substitutions of successful derivations. As we 
discussed before, there are many situations in which we also want to model 
the (intermediate) results of partial derivations. For instance, this will be the 
case when - in the next section - we study the termination of input consuming 
programs. 

In this section we define a somewhat more complex denotational semantics 
which has the advantage of modelling the observables given by both successful 
and partial derivations in a rather symmetric way. The two semantics we are 
going to introduce are compositional, correct and fully abstract with respect 
to the operational semantics of input consuming simply moded programs and 
queries, i.e., $\mathcal{O}^{ic}_{S}(P, SM)$ and $\mathcal{O}^{ic}_{P}(P, SM)$, where $SM$ is the class of simply moded 
queries. As in the standard S-semantics, this is a denotational semantics that 
can be built by means of a bottom-up construction. 



4.1 Simply Local Substitutions and Simply Local Models 

When input consuming derivations are applied to simply moded programs and 
queries, important properties follow from the way clauses become instantiated 
along the derivations. The notion of simply local substitution is introduced in 
[12] to reflect this instantiation mechanism. A clause $c : H \leftarrow B_1, \ldots, B_n$ 
becomes instantiated by its “caller” (the atom that is resolved using $c$) and its 
“callees” (the clauses used to resolve the body atoms of $c$). Thus, a simply local 
substitution is defined as the composition of several substitutions, $\sigma_0, \sigma_1, \ldots, \sigma_n$, 
one for each atom in the given clause, such that $\sigma_0$ binds the input variables of 
the head of the clause, and each $\sigma_i$ ($i > 0$) creates a binding from the output 
variables to input terms of $B_i\sigma_0 \cdots \sigma_{i-1}$. 

Definition 14 (Simply Local Substitution). Let $\theta$ be a substitution. We 
say that $\theta$ is simply local w.r.t. the clause $c : H \leftarrow B_1, \ldots, B_n$ if there exist 
substitutions $\sigma_0, \sigma_1, \ldots, \sigma_n$ and disjoint sets of fresh (w.r.t. $c$) variables $v_0, v_1, \ldots, v_n$ 
such that $\theta = \sigma_0\sigma_1 \cdots \sigma_n$ where 

- $Dom(\sigma_0) \subseteq Var(In(H))$ and $Ran(\sigma_0) \subseteq v_0$, 

- for $i \in [1..n]$, 
  $Dom(\sigma_i) \subseteq Var(Out(B_i))$ and $Ran(\sigma_i) \subseteq Var(In(B_i)\sigma_0\sigma_1 \cdots \sigma_{i-1})$. 

The substitution $\theta$ is simply local w.r.t. a query $\mathbf{B}$ if $\theta$ is simply local w.r.t. the 
clause $q \leftarrow \mathbf{B}$ where $q$ is any variable-free atom. 

(*) The only observable difference between P1 and P2 lies in the multiplicity of the 
answers: the query p(a) succeeds twice in P1 and only once in P2, but answer 
multiplicity is not an observable we consider here. 



Example 15. Consider the program APPEND together with the mode append(I, I, O)
and its recursive clause

c: append([H|Xs], Ys, [H|Zs]) ← append(Xs, Ys, Zs).

The substitution $\theta$ = {Xs/[], Ys/W, Zs/W} is simply local w.r.t. c. In fact $\theta = \sigma_0\sigma_1$
where $\sigma_0$ = {Xs/[], Ys/W} and $\sigma_1$ = {Zs/W}. Consider now the query

Q : append([a, X, c], Ys, Zs), append(Zs, [b], Ls).

The substitution $\theta$ = {Zs/[a,X,c|Ys]} is simply local w.r.t. Q. In fact $\theta = \sigma_1\sigma_2$
where $\sigma_1$ = {Zs/[a,X,c|Ys]} and $\sigma_2$ is the empty substitution.

The denotational semantics we are about to define is based on a restricted 
notion of model. Here and in the sequel interpretations are sets of moded atoms 
closed under variance. 

Definition 16 (Simply Local Model). Let M be a set of moded atoms. We
say that M is a simply local model of a clause $c : H \leftarrow B_1, \ldots, B_n$ if for every
substitution $\theta$ simply local w.r.t. c,

if $B_1\theta, \ldots, B_n\theta \in M$ then $H\theta \in M$. (1)

M is a simply local model of a program P if it is a simply local model of each
of its clauses.

Clearly a simply local model is not necessarily a model in the classical sense,
since the substitution $\theta$ in (1) is required to be simply local. For example, given
the program {q(1)., p(X) ← q(X).} with modes q(I), p(O), a model must contain
the atom p(1), whereas a simply local model does not necessarily contain p(1),
since {X/1} is not simply local w.r.t. p(X) ← q(X).
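The claim about {X/1} can be checked directly against Definition 14. The derivation below is only a sketch of that check; the symbol $\epsilon$ for the empty substitution is notation assumed here, not taken from the text.

```latex
\begin{align*}
c &: p(X) \leftarrow q(X), \qquad \text{modes } q(I),\ p(O) \\
Dom(\sigma_0) &\subseteq Var(In(p(X))) = \emptyset
   \;\Longrightarrow\; \sigma_0 = \epsilon \\
Dom(\sigma_1) &\subseteq Var(Out(q(X))) = \emptyset
   \;\Longrightarrow\; \sigma_1 = \epsilon \\
\theta &= \sigma_0\sigma_1 = \epsilon \neq \{X/1\}
\end{align*}
```

Under this reading the only substitution that is simply local w.r.t. c is the empty one, so condition (1) only requires that p(X) belongs to M whenever q(X) does.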

A minimal simply local model exists and it is bottom-up computable by 
applying the following operator [12]. 

Definition 17. Given a program P and a set of moded atoms I, we define

$T^{SL}_P(I) = I \cup \{\, H\theta \mid \exists\, c : H \leftarrow \mathbf{B}$ variant of a clause of P,
$\theta$ is simply local w.r.t. c,
$\mathbf{B}\theta \in I \,\}$

$T^{SL}_P$ is monotonic and continuous on the lattice where sets of moded atoms
are ordered by set inclusion.

In the following we denote by $SM_P$ the set of all simply moded atoms of the
extended Herbrand universe of P. In [12] it is proven that if P is simply moded
and $I \subseteq SM_P$ then

$T^{SL}_P \uparrow \omega(I)$ is the least simply local model of P containing I. (2)

This allows us to define our models. 

Definition 18. Let P be a program. We define

- $M^{SL}_P$ is the least simply local model of P,

- $PM^{SL}_P$ is the least simply local model of P containing $SM_P$.

The existence of these models is guaranteed by (2); in fact (2) also shows how
to construct them, as it implies that

$M^{SL}_P = T^{SL}_P \uparrow \omega(\emptyset)$, and $PM^{SL}_P = T^{SL}_P \uparrow \omega(SM_P)$. (3)

4.2 Relation among Denotational and Operational Semantics 

To relate $M^{SL}_P$ and $PM^{SL}_P$ to $\mathcal{O}_S(P, SM)$ and $\mathcal{O}_P(P, SM)$ we need to relate
$T^{SL}_P$ to the results of input consuming derivations; this is achieved in the
following lemma, proved in [12].

Lemma 19. Let the program P and the query $\mathbf{A}$ be simply moded and $I \subseteq SM_P$
be a set of moded atoms. The following statements are equivalent:

(i) there exists an input consuming derivation $\mathbf{A} \xrightarrow{\vartheta}_P \mathbf{C}$ with $\mathbf{C} \subseteq I$,

(ii) there exists a substitution $\theta$ simply local w.r.t. $\mathbf{A}$, such that $\mathbf{A}\theta \subseteq T^{SL}_P \uparrow \omega(I)$,

where $\mathbf{A}\theta \approx \mathbf{A}\vartheta$.

We can now prove that $M^{SL}_P$ and $PM^{SL}_P$ fully characterize the semantics of
ic-derivations for simply moded programs and queries, namely they are equal to
$\mathcal{O}_S(P, SM)$ and $\mathcal{O}_P(P, SM)$, respectively.

Theorem 20. Let P be simply moded. Then

(i) $M^{SL}_P = \mathcal{O}_S(P, SM)$.

(ii) $PM^{SL}_P = \mathcal{O}_P(P, SM)$.

Proof. Immediate by (3), Lemma 19 and the definitions of $\mathcal{O}_S(P, SM)$ and
$\mathcal{O}_P(P, SM)$.



An example follows.

Example 21. Let us consider again the program APPEND.

1. First let us consider its successful ic-derivations. Hence we have to build
$M^{SL}_{APPEND}$:

$M^{SL}_{APPEND} = \{append([t_1, \ldots, t_n], s, [t_1, \ldots, t_n|s]) \mid n \in [0..\infty]$,
and $t_1, \ldots, t_n, s$ are any terms$\}$.

Notice that this model is different from S(APPEND), reported in Example 9.
We are going to relate S(P) and $M^{SL}_P$ later in this section.

2. Now let us consider the results of partial derivations. Recall that $PM^{SL}_{APPEND}$
is obtained by repeatedly applying $T^{SL}_P$ to each simply moded atom. Simply
moded atoms are append(s, t, x) where s and t are arbitrary terms but x is
a variable not occurring in s or in t. We obtain

$PM^{SL}_{APPEND} = M^{SL}_{APPEND}$

$\cup\ \{append(s, t, x) \mid x$ is a fresh variable$\}$

$\cup\ \{append([t_1, \ldots, t_m|s], t, [t_1, \ldots, t_m|x]) \mid x$ is a fresh variable$\}$

where $s, t, t_1, \ldots, t_m$ are arbitrary terms.

Consider now the query append([a, b, c|X], Y, Z). It is straightforward to
check that the substitution $\theta$ = {Z/[a, b|Z']} is simply local w.r.t. it, and that
append([a, b, c|X], Y, Z)$\theta \in PM^{SL}_{APPEND}$. Therefore, by using Theorem 20, we
can conclude that there exists a partial derivation starting in append([a, b,
c|X], Y, Z), with computed answer $\theta$. Following the same reasoning, one can
also conclude that the query has a partial derivation with computed answer
$\theta'$ = {Z/[a|Z']}.



4.3 Relation between S-semantics and Denotational Semantics for
IC-programs

In this section we compare the denotational semantics $M^{SL}_P$ with the S-semantics
S(P) of simply moded programs.

First, we need a new definition: let I be a set of moded atoms; the input
closure of I is defined as:

$InCl(I) = \{A\theta \mid A \in I$ and $Var(A) \cap Var(\theta) \subseteq Var(In(A))\}$

So the input closure of an atom is obtained by instantiating its input positions 
in all possible ways, provided that no new links are created between the input 
and the output positions. 

Theorem 22. Let P be a well and simply moded program. Then

$M^{SL}_P = InCl(S(P))$

Proof. First observe that the class of simply moded programs is contained in
the class of nicely moded programs, hence Theorem 12 is applicable also when
we consider well and simply moded programs and simply moded queries.

- $M^{SL}_P \subseteq InCl(S(P))$. Let A be simply moded and assume $A\vartheta \in M^{SL}_P$. Then,
by Theorem 20, $A\vartheta \in \mathcal{O}_S(P, SM)$. By Theorem 12 there exists $A' \in S(P)$
(renamed apart w.r.t. A), and $\sigma = mgu(A, A')$ such that $In(A)\sigma \approx In(A)$ and
$A\sigma \approx A\vartheta$. Since A is simply moded, we can choose $\sigma$ such that $Dom(\sigma) \cap
Var(A') \subseteq Var(In(A'))$. Therefore $A\vartheta \approx A\sigma = A'\sigma \in InCl(S(P))$.

- $M^{SL}_P \supseteq InCl(S(P))$. Let $A\theta \in InCl(S(P))$ and $A = p(\mathbf{s}, \mathbf{t}) \in S(P)$. There
exist a simply moded atom $A' = p(\mathbf{s}', \mathbf{z})$, renamed apart w.r.t. A, and a substitution
$\sigma$ such that $\sigma = mgu(A, A')$, $In(A')\sigma = In(A')$ and $A'\sigma = A\sigma \approx A\theta$. By
Theorem 12 there exists $\vartheta$ such that $A'\vartheta \in \mathcal{O}_S(P, SM)$ and $A'\vartheta \approx A'\sigma \approx A\theta$.
Hence, by Theorem 20, $A\theta \in M^{SL}_P$.

5 Semantic-Based Verification of Termination 

There have been only a few proposals tackling the specific problem of verifying
the termination of logic programs with dynamic scheduling, namely by
Apt and Luitjes [5], Marchiori and Teusink [30] and Smaus. Input consuming
derivations were indeed introduced by Smaus in [40] to simplify the study of
program properties which depend on selection rules, and in [41] he started to
study in particular the problem of termination of input consuming derivations.

In [10] and [12] we study two classes of programs terminating with respect
to input consuming derivations and well-formed queries. The two classes differ
in various aspects. First of all, two different classes of well-formed queries are
considered: nicely moded queries in [10], simply moded queries in [12]. To give
a uniform presentation, in [12] we consider a parametric class of programs in
which all input consuming derivations terminate. The parameter is a given class
C of queries.

Definition 23 (Input Termination w.r.t. a class C of queries). Let C be
a class of queries. A program is called input terminating with respect to C if all
its input consuming derivations started in a query in C are finite.

The second difference between the two classes of terminating programs in [10]
and [12] is in the termination proof techniques. The first class follows the style of
[3,8] and it uses a simple (syntactic) termination condition, but it is also a rather
restrictive class. The second class follows the style of [6,7], which is based on a
more complex model theoretic approach, and it uses the semantics introduced
in Section 4; this is a significantly larger class of programs.

Let us consider first the more restrictive and simple class introduced in [10]:
the class of nicely moded quasi recurrent programs. Its definition is based on
the notion of well moded level mapping, first introduced in [21]. Here we use
well moded level mappings extended to all the terms on $B^E_P$ as in [10]. $B^E_P$, the
extended Herbrand base of P, is the set of equivalence classes of all (possibly
non-ground) atoms, modulo renaming, whose predicate symbols appear in P.

Definition 24 (Moded Level Mapping). Let P be a program and $B^E_P$ be the
extended Herbrand base for the language associated with P. A function | | is a
moded level mapping for P if:

- it is a function $|\ | : B^E_P \to \mathbb{N}$ from atoms to natural numbers;

- for any $\mathbf{t}$ and $\mathbf{u}$, $|p(\mathbf{s}, \mathbf{t})| = |p(\mathbf{s}, \mathbf{u})|$.

For $A \in B^E_P$, |A| is the level of A.



Definition 25 (Quasi Recurrency). Let P be a program.

- A clause of P is called quasi recurrent with respect to a moded level mapping
| | if for every instance $H \leftarrow \mathbf{A}, B, \mathbf{C}$ of it,

if $Rel(H) \simeq Rel(B)$ then $|H| > |B|$.

- P is called quasi recurrent with respect to | | if all its clauses are. P is
called quasi recurrent if it is quasi recurrent with respect to some moded
level mapping $|\ | : B^E_P \to \mathbb{N}$.



Theorem 26. Let P be a nicely moded program. If P is quasi recurrent then P
is input terminating with respect to the class of nicely moded queries.

The proof of this theorem can be found in [10]. 

Thus, the quasi recurrent condition is a sufficient condition for input ter- 
mination of nicely moded programs and nicely moded queries. But it is not a 
necessary condition: there are nicely moded programs input terminating on all 
nicely moded queries which are not quasi recurrent as shown by the following 
simple example taken from [10]. 

Example 27. Consider the following program with moding p(I, O).

p(X,a) ← p(X,b).
p(X,b).

This program is clearly input terminating, however it is not quasi recurrent.
For the first clause to be quasi recurrent it would have to be the case that
|p(X,a)| > |p(X,b)|, for some moded level mapping | |. On the other hand, since
p(X,a) and p(X,b) differ only in the terms filling their output positions,
by definition of moded level mapping, |p(X,a)| = |p(X,b)|. Hence, we have a
contradiction.

A full characterization can be obtained only by further restricting the class 
of programs, passing from nicely moded to simply moded and input-recursive 
programs. 

[Footnote to Definition 25] Given two predicate symbols p and q defined in a program P, we denote
by $p \simeq q$ the fact that the definitions of the two predicates are mutually recursive.

Definition 28 (Input-Recursive Program). Let P be a program.

- A clause $H \leftarrow \mathbf{A}, B, \mathbf{C}$ of P is called input-recursive if

$Rel(H) \simeq Rel(B)$ implies $Var(In(B)) \subseteq Var(In(H))$.

- A program P is called input-recursive if all its clauses are.

Input-recursive is a syntactic condition on a clause requiring that the set
of variables occurring in the arguments filling the input positions of each
recursive call in the clause body is a subset of the set of variables occurring in
the arguments filling the input positions of the clause head. The class of
input-recursive programs has strong similarities with the class of primitive recursive
functions and recurrent logic programs. It does not include programs whose
termination depends on the so-called inter-argument relations, such as quicksort.
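Because the input-recursive condition is purely syntactic, it can be checked mechanically. The following Python sketch is one possible reading of Definition 28, not the authors' implementation. The term encoding (tuples for compound terms, capitalised strings for variables), the mode strings, the helper names and the textbook quicksort clause are all assumptions made for illustration; the relation of mutual recursion is taken to be reflexive.

```python
def variables(term):
    """Variables of a term: strings starting with an upper-case letter."""
    if isinstance(term, str):
        return {term} if term[:1].isupper() else set()
    found = set()
    for arg in term[1:]:              # term = (functor, arg1, ..., argk)
        found |= variables(arg)
    return found


def input_vars(atom, modes):
    """Var(In(atom)): variables occurring in the input positions of atom."""
    found = set()
    for arg, mode in zip(atom[1:], modes[atom[0]]):
        if mode == "I":
            found |= variables(arg)
    return found


def mutually_recursive(program):
    """Return a test for p ~ q: each predicate (transitively) refers to the other."""
    preds = {atom[0] for head, body in program for atom in [head, *body]}
    reach = {p: {p} for p in preds}   # reflexive by construction (an assumption)
    changed = True
    while changed:
        changed = False
        for head, body in program:
            for atom in body:
                new = reach[atom[0]] - reach[head[0]]
                if new:
                    reach[head[0]] |= new
                    changed = True
    return lambda p, q: q in reach[p] and p in reach[q]


def is_input_recursive(program, modes):
    """Check Var(In(B)) <= Var(In(H)) for every recursive body atom B."""
    related = mutually_recursive(program)
    return all(
        input_vars(atom, modes) <= input_vars(head, modes)
        for head, body in program
        for atom in body
        if related(head[0], atom[0])
    )


MODES = {"append": "IIO", "quicksort": "IO", "partition": "IIOO"}

APPEND = [
    (("append", "nil", "Ys", "Ys"), []),
    (("append", ("cons", "H", "Xs"), "Ys", ("cons", "H", "Zs")),
     [("append", "Xs", "Ys", "Zs")]),
]

# Only the recursive clause of a textbook quicksort (hypothetical formulation).
QUICKSORT = [
    (("quicksort", ("cons", "X", "Xs"), "Ys"),
     [("partition", "Xs", "X", "Ls", "Bs"),
      ("quicksort", "Ls", "Sls"),
      ("quicksort", "Bs", "Sbs"),
      ("append", "Sls", ("cons", "X", "Sbs"), "Ys")]),
]

print(is_input_recursive(APPEND, MODES))
print(is_input_recursive(QUICKSORT, MODES))
```

In the quicksort clause the recursive calls take their input from Ls and Bs, which are outputs of partition rather than input variables of the head, which is how the inter-argument dependency shows up syntactically.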

Quasi recurrency fully characterizes input termination of simply moded and 
input-recursive programs with respect to nicely moded queries. 

Theorem 29. Let P be a simply moded and input-recursive program. P is quasi
recurrent if and only if P is input terminating with respect to the class of nicely
moded queries.

The proof of this theorem can be found in [10]. 

To consider a larger class of input terminating programs we can follow the 
same approach pursued by Apt and Pedreschi in defining acceptable programs 
and use a model to capture the inter-argument relations between the atoms in 
a query. Intuitively, the model represents all the possible contexts in which a 
specific atom in a query can be called. Standard models suffice when standard 
left-to-right derivations are considered, that is when the context depends only
on the computed answers of the atoms occurring on the left of the considered 
one. When input consuming derivations are considered, the description of all the 
possible contexts is much more complex since there may be atoms in the query 
which are only partially computed when the considered atom is selected. Hence a 
computed answer semantics does not provide enough information, which is why 
we need to capture partial computed answers of input consuming derivations. 

The semantics defined in [12] and the concept of simply local model give us 
the right tools and allow us to identify a large class of input terminating programs 
which includes also programs employing a non-trivial recursion scheme such as 
quicksort, permute, transpose. In fact, based on the notion of simply local 
models, in [12] we introduced the notion of simply acceptable programs which 
corresponds to the notion of acceptable programs introduced in [6].

Definition 30 (Simply Acceptable Program). Let P be a program and M
a simply local model of P containing $SM_P$.

- A clause c of P is simply acceptable with respect to a moded level mapping
| | and M if for every variant $H \leftarrow \mathbf{A}, B, \mathbf{C}$ of c and every substitution $\theta$
simply local with respect to c,

if $\mathbf{A}\theta \in M$ and $Rel(H) \simeq Rel(B)$ then $|H\theta| > |B\theta|$.

- P is simply acceptable with respect to M if there exists a moded level mapping
| | such that each clause of P is simply acceptable with respect to | | and
M. We also say that P is simply acceptable if it is simply acceptable with
respect to some M and moded level mapping | |.

Simple acceptability fully characterizes input termination of simply moded 
programs with respect to simply moded queries. 

Theorem 31. Let P be a simply moded program. P is simply acceptable if and 
only if it is input terminating with respect to simply moded queries. 

The following example shows how we can use the above theorem to reason 
about termination of a program. 

Example 32. Consider the following PERMUTE program

permute([X|Xs],Ys) ← insert(Zs,X,Ys), permute(Xs,Zs).
permute([],[]).

insert([],X,[X]).
insert([U|Xs],X,[U|Zs]) ← insert(Xs,X,Zs).

We consider it with two different modes. 

1. First, consider the mode permute(O, I), insert(O, O, I).

Notice that the program is not input terminating in this mode: by repeatedly
selecting the rightmost atom, the query permute(Xs,Ys) generates an
infinite input consuming derivation. By Theorem 31, we can prove it by showing
that PERMUTE in this mode cannot be simply acceptable with respect to
$PM^{SL}_{PERMUTE}$ and a moded level mapping which is invariant under renaming.
First note that $PM^{SL}_{PERMUTE}$ contains every atom of the form insert(Us, U, t)
where Us and U are disjoint from t, i.e., every simply moded atom whose predicate
is insert. Therefore, in particular, insert(Us, U, Vs) $\in PM^{SL}_{PERMUTE}$. The
substitution $\theta$ = {Ys/Vs, Zs/Us, X/U} is simply local w.r.t. the first clause.
Therefore, for this clause to be simply acceptable, by Theorem 31, there
would have to be a moded level mapping, invariant under renaming, such
that |permute([U|Xs], Vs)| > |permute(Xs, Us)|. This is a contradiction since
a moded level mapping depends only on the input arguments (the second argument
of permute) and we are considering a level mapping invariant under
renaming.

Thus Theorem 31 can be used to diagnose a program, in that we can pinpoint 
why it does not input terminate. 

2. Now consider the program PERMUTE together with the mode permute(I, O),
insert(I, I, O).

In this case, in order to make the program simply moded we have to permute
the two body atoms of the first permute clause (see the footnote below). That is,
permute is redefined as

permute([X|Xs],Ys) ← permute(Xs,Zs), insert(Zs,X,Ys).
permute([],[]).

[Footnote] Actually, everything we state applies to the class of permutation simply moded
programs, i.e., those programs and queries that are simply moded possibly after a
permutation of body atoms. For the sake of notational simplicity, we avoid referring to
this in a structural way.

Notice that the program is now input terminating with respect to simply
moded queries. This is in fact the natural mode of the PERMUTE program.
To demonstrate the termination one can apply Theorem 31 using any simply
local model containing $SM_P$ together with the following moded level
mapping:

|permute(l, _)| = len(l),

|insert(l, _, _)| = len(l).
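The decrease required by simple acceptability can be observed on concrete instances. The Python sketch below is illustrative only: it assumes that len counts the cons cells along the spine of a (possibly partial) list and is 0 on any other term, which is a common convention but is not spelled out in this section. The tuple encoding of terms and the helper names are likewise hypothetical.

```python
def list_length(term):
    """Assumed len: number of cons cells on the spine, 0 for any other term."""
    n = 0
    while isinstance(term, tuple) and term[0] == "cons":
        n += 1
        term = term[2]
    return n


def level(atom):
    """|permute(l, _)| = len(l) and |insert(l, _, _)| = len(l).

    Only the first argument is inspected, which is an input position in
    the mode permute(I, O), insert(I, I, O).
    """
    return list_length(atom[1])


def cons_list(items, tail="nil"):
    """Build the term [i1, ..., ik | tail] as nested ('cons', head, rest) tuples."""
    for item in reversed(items):
        tail = ("cons", item, tail)
    return tail


# An instance of permute([X|Xs], Ys) <- permute(Xs, Zs), insert(Zs, X, Ys).
head = ("permute", cons_list(["a", "b", "c"]), "Ys")
recursive_call = ("permute", cons_list(["b", "c"]), "Zs")

print(level(head), level(recursive_call))
```

Since the mapping never looks at output positions, replacing Ys by any other term leaves the level unchanged, as Definition 24 requires.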

6 Conclusion 

In this paper, we have illustrated two denotational semantics proposed in [9],
[10] and in [12] for input consuming derivations in logic programs, and we have
shown how these semantics have been used for studying termination properties
of such programs.

While the first semantics (introduced in [9]) models exclusively the results 
of successful derivations and requires programs to be well moded and nicely 
moded, the second one (introduced in [12]) models also the results of incomplete 
derivations and requires programs and queries to be simply moded. 

As mentioned in the introduction, in the context of parallel and concurrent 
programs, one can have derivations that never succeed, and yet compute sub- 
stitutions [36]. Thus we have provided a denotational semantics also for such 
programs, which goes beyond the usual success-based SLD resolution mecha- 
nism of logic programming. 

Input consuming derivations bear a certain resemblance to derivations in
the language of Moded (Flat) GHC [45]. Actually, input consuming programs can 
be seen as a simplified version of moded (F)GHC. We want to note however that 
Moded (F)GHC is a full-fledged programming paradigm, while input consuming 
programs are meant for abstraction purposes. 

As a concluding remark, we want to stress the relation between ic-programs 
and programs that use delay declarations. A significant class of programs with 
delay declarations whose derivations are input consuming derivations has been 
identified in [11].

Abstract. This paper aims at offering an insightful synthesis of different
compositional semantics for logic program composition which have
been developed in the literature. In particular, we will analyse the no- 
tions of program equivalence, compositionality, and full abstraction for 
logic programs. We will show how the notion of supported interpretation 
provides a unifying compositional model-theoretic characterisation both 
of positive programs and of programs containing negation. 



1 Introduction 

Building complex software systems by combining existing components is a stan- 
dard methodology of modern software development. The effectiveness of the 
program composition approach depends on the possibility of reasoning on the 
composition process itself. The availability of well-founded characterisations of 
programs and program compositions is needed to perform transformation, anal- 
ysis and verification. 

One of the most important relations between programs (in any programming 
language) is program equivalence. This relation is at the basis of most, if not all, 
programming methodologies. Each method of giving a semantics to programs in- 
duces an equivalence relation on programs. It is therefore essential to understand 
how these equivalences are related. 

As pointed out in [30], different formulations that define identical equiva- 
lences offer different frameworks in which to reason about programs. Moreover, 
stronger equivalence relations may be used to reason about programs and en- 
sure that the programs are equivalent in a weaker sense, which might not be as 
suitable for reasoning. Reasoning about programs concerns also the correctness 
of source-to-source transformations such as those occurring in program develop- 
ment [35]. 

The properties of compositionality and full abstraction play a crucial role in 
the study of the semantics of programming languages. Simply stated, a seman- 
tics is compositional (or homomorphic) if the meaning of a program can be ob- 
tained from the meaning of its components. If a semantics is compositional with 
respect to some composition operations then the induced equivalence relation 
is a congruence for those operations. This property establishes a firm founda- 
tion for reasoning about programs and program transformations. Suppose that 
a program P consists of two parts, Q and R say, suitably composed together. 
Suppose also that R' is a more efficient version of R, obtained for instance by
applying some program transformation technique to R. If R' is equivalent to R
in the chosen semantics then the property of compositionality ensures that the 
substitution of R' for R will not affect the meaning of the whole program P. 

Often the semantics O describing the observable behaviour of programs is 
not compositional. In these cases it is then necessary to consider a more dis- 
tinguishing (or finer) semantics S which preserves O and which is a congruence 
for the set of compositions considered. The compositionality of S ensures that 
programs (or program parts) which are S-equivalent can be replaced with one
another without affecting the intended semantics O of the whole system. The 
property of full abstraction establishes that the equivalence relation induced by 
S is the largest equivalence relation that can be used to substitute programs (or 
program parts) without affecting the intended semantics of the whole system. 

In this paper we analyse the properties of compositionality and full abstrac- 
tion in the context of logic programming. Indeed, because of the declarative 
programming style it features, logic programming can be fruitfully employed as 
the specification language of software components. Logic programming supports 
a wealth of programming styles developed in algorithmic programming, database 
programming, and artificial intelligence programming via a small number of pow- 
erful features (unification, recursion, and nondeterminism). On the other hand, 
logic programming has firm foundations in mathematical logic. The availability 
of different equivalent characterisations of programs offers the ground to perform 
sound semantics-based transformation, analysis and verification. 

In this paper, we focus on the most basic composition operation over logic 
programs, the union of programs, and we analyse and compare different seman- 
tics that have been proposed in the literature. In the perspective of providing an 
insightful synthesis of these semantics, we will show how the notion of supported 
interpretation provides a unifying characterisation of both positive programs and 
of programs containing negation. Notice that the aim of this paper is not to pro- 
vide a comprehensive survey of the compositional semantics for logic programs 
which have been proposed in the last ten years. (The interested reader may refer
to [10] for a survey which covers also different modular extensions of logic
programming.)

The rest of the paper is organised as follows. 

Section 2 introduces some background material, namely some logic program- 
ming terminology, the notions of compositionality and full abstraction, and a 
hierarchy of logic program equivalences. 

Section 3 is devoted to analysing compositional semantics of definite programs.
We first analyse three equivalence relations considered in [30]: subsumption
equivalence, weak subsumption equivalence and logical equivalence. We show 
that while they are all compositional, logical equivalence is the fully abstract 
relation. We then consider a different model-theoretic characterisation, based on 
the notion of admissible model presented in [10]. The relation between admissible 
models and the other semantics is illustrated, and a fully abstract refinement of 
admissible models is presented here for the first time. The section is concluded by 
introducing the notion of supported interpretation which provides an alternative
characterisation of logical equivalence and which will be used also to characterise
normal programs. The results presented in this section are summarised
in Figure 2 which contains a hierarchy of compositional semantics for definite 
programs. 

Section 4 is devoted to analysing compositional semantics for normal programs,
that is, programs containing negation. Two main problems arise here: (1) the 
existence of many “intended” semantics for normal programs, and (2) the or- 
thogonality of non-monotonicity and compositionality. We will show that the 
notion of supported interpretation introduced in Section 3 provides a unifying 
characterisation of a number of “intended” semantics for normal programs. A 
general full abstraction result will be also presented here for the first time. 

Finally, Section 5 briefly discusses other forms of program compositions, while 
Section 6 contains some concluding remarks. 

To simplify the reading, all proofs are reported in the Appendix. 

2 Preliminaries 

2.1 Logic Programming 

We will use the standard definitions and terminology of logic programming, as
reported for instance in [2,29]. A definite logic program is a finite set of clauses of
the form $A \leftarrow B_1, \ldots, B_n$ ($n \geq 0$), where $A, B_1, \ldots, B_n$ are atoms. A normal logic
program is a finite set of clauses of the form $A \leftarrow B_1, \ldots, B_n$ ($n \geq 0$), where A
is an atom and where $B_1, \ldots, B_n$ are possibly negated atoms.

Clauses without premise part, i.e., of the form $A \leftarrow$, will be called extensional
(or unit) clauses, and programs containing only extensional clauses will be called
extensional programs. We will also denote by Defs(P) the set of predicates
defined in a program P.

We will use the standard notions of Herbrand interpretations and Herbrand 
models, and we will denote by LHM{P) the least Herbrand model of a pro- 
gram P. We will also use the definition of the standard immediate consequence 
operator T{P): 

$T(P)(I) = \{A \mid \exists \mathbf{B} : A \leftarrow \mathbf{B} \in ground(P) \wedge \mathbf{B} \subseteq I\}$

where P is a definite program, B denotes a (possibly empty) conjunction of 
atoms, and ground{P) denotes the set of ground instances of clauses of P. 
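For a finite ground program the operator T(P) and the least Herbrand model can be computed directly. The Python sketch below is a minimal illustration under that assumption: clauses are encoded as (head, body) pairs of atom names, and the function names and the sample program are hypothetical.

```python
def immediate_consequence(program, interpretation):
    """T(P)(I): heads of ground clauses whose body atoms all belong to I."""
    return {head for head, body in program if set(body) <= interpretation}


def least_herbrand_model(program):
    """Iterate T(P) from the empty interpretation until a fixpoint is reached."""
    current = set()
    while True:
        following = immediate_consequence(program, current)
        if following == current:
            return current
        current = following


# A ground definite program: each clause is (head, [body atoms]).
program = [
    ("q(a)", []),            # q(a) <-
    ("p(a)", ["q(a)"]),      # p(a) <- q(a)
    ("r(a)", ["s(a)"]),      # r(a) <- s(a)
]

print(sorted(least_herbrand_model(program)))
print(sorted(immediate_consequence(program, {"s(a)"})))
```

The second call shows that T(P) is defined on arbitrary interpretations, not only on those reached from the empty set; this is what makes equality of T(P) functions a finer relation than equality of least Herbrand models.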



2.2 Compositionality and Full Abstraction 

A semantics for a programming language provides meanings for programs or, 
more generally, program parts. Moreover, each method of giving semantics to 
a programming language induces an equivalence relation on programs. Namely, 
two programs are equivalent if and only if they have the same meaning in the 
chosen semantics.

An equivalence relation $\equiv_1$ is finer than another equivalence relation $\equiv_2$
($\equiv_1 \subseteq \equiv_2$) if and only if whenever $P \equiv_1 Q$ then $P \equiv_2 Q$. Furthermore $\equiv_1$ is
strictly finer than $\equiv_2$ ($\equiv_1 \subset \equiv_2$) if and only if $\equiv_1$ is finer than $\equiv_2$ and $\equiv_2$ is not
finer than $\equiv_1$.

The properties of compositionality and full abstraction have been recognised 
as two fundamental concepts in the studies on the semantics of programming 
languages [32,36]. Informally, a semantics is compositional if equivalent programs 
(or program parts) are indistinguishable, that is, if they exhibit equal observable 
behaviour in all possible contexts. On the other hand, a semantics is fully abstract
if indistinguishable programs (or program parts) are equivalent. 

The formal definition of these properties relies on the notion of observable 
behaviour of a program and on the notion of program composition. The for- 
mer can be represented by a mapping O which associates with every program 
P an object 0{P) denoting the observable behaviour of P. The latter can be 
represented by a set Com of (possibly partial) functions over programs. 

A semantics is compositional if the induced equivalence is compositional for
the pair (O, Com), that is, if it preserves the observables and is a congruence
for the set of compositions. Formally, an equivalence relation $\equiv$ is compositional
for (O, Com) if and only if:

1. $\equiv$ preserves O, that is $\forall P, Q : P \equiv Q \Rightarrow O(P) = O(Q)$.

2. $\equiv$ is a congruence for Com, that is $\forall f \in Com$, $\forall P_1, \ldots, P_n, Q_1, \ldots, Q_n$:

$P_i \equiv Q_i\ (i = 1, \ldots, n) \Rightarrow f(P_1, \ldots, P_n) \equiv f(Q_1, \ldots, Q_n)$.

There is always a coarsest congruence for (O, Com), which is intuitively the
“indistinguishability relation”. A semantics is fully abstract if the induced
equivalence includes this largest congruence. A semantics is both compositional and
fully abstract if it coincides with it. Two programs P and Q are distinguishable
under (O, Com) if there exists a context C[.] (defined via Com) in which the
substitution of P with Q changes the external behaviour (defined via O) of the
context. Formally, P and Q are distinguishable iff $\exists C[.] : O(C[P]) \neq O(C[Q])$.
We put:

$P \cong Q \iff$ P and Q are not distinguishable under (O, Com).

Then an equivalence relation $\equiv$ is fully abstract for (O, Com) if and only if:

$\forall P, Q : P \cong Q \Rightarrow P \equiv Q$.

In this paper, we consider both definite and normal programs, and we assume 
that the language (or vocabulary) in which programs are written is fixed. Namely, 
the Herbrand base B we refer to is determined by a set of function and predicate 
symbols that include all function and predicate symbols used in the programs 
being considered. We will consider (set-theoretic) union of programs (denoted 
by U) as the only composition operation. The observable behaviour of a logic 
program may be defined in different ways, depending on which aspects of the 
computation one is interested in looking at. In the case of definite logic programs,
a natural choice of the observables is the success set of a program [2,29,40]. The
success set SS(P) of a program P is the set of ground atoms A such that
$P \cup \{\leftarrow A\}$ has an SLD-refutation. Therefore we put:

O(P) = SS(P).
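With the success set as observable and union as the only composition, a very small experiment shows why equality of success sets cannot by itself serve as a compositional equivalence. The Python sketch below is illustrative: the propositional programs P, Q and R and the helper success_set are hypothetical, and the success set is computed as the least fixpoint of the immediate consequence operator, which coincides with it for ground definite programs.

```python
def success_set(program):
    """Least Herbrand model of a ground definite program, i.e. its success set."""
    model = set()
    while True:
        derived = {head for head, body in program if set(body) <= model}
        if derived == model:
            return model
        model = derived


P = [("p", ["q"])]      # p <- q
Q = []                  # the empty program
R = [("q", [])]         # context: the unit clause q <-

# P and Q have the same observable behaviour in isolation ...
same_observables = success_set(P) == success_set(Q)
# ... but the context "union with R" tells them apart.
distinguished = success_set(P + R) != success_set(Q + R)

print(same_observables, distinguished)
```

Here the context C[.] is the union with R, so P and Q are distinguishable under (O, Com) even though O(P) = O(Q); any compositional equivalence must therefore be strictly finer than equality of success sets.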

2.3 Equivalence of Definite Logic Programs 

A number of different equivalence relations for logic programs were studied and
compared with one another in [30].

The equivalence relation induced by the immediate consequence operator 
T{P) is one of these equivalences. Namely two programs are equivalent if and 
only if they have the same T{P), that is, their immediate consequence operators 
coincide on every Herbrand interpretation. In [30] a syntactic notion of equiva- 
lence, subsumption equivalence was also introduced, and it was shown to coincide 
with the equality of T(P) functions on programs. Let $C_1$ and $C_2$ be the definite
clauses $A \leftarrow B$ and $D \leftarrow E$, respectively. $C_1$ is subsumed by $C_2$ if there is a
substitution $\vartheta$ such that $A = D\vartheta$ and $E\vartheta \subseteq B$. Two logic programs P and Q are
subsumption equivalent if every clause of P is subsumed by some clause of Q 
and vice-versa. Existing algorithms [25] can be therefore exploited to determine 
whether two programs are T(P) equivalent. 
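The subsumption test just defined can be sketched with one-way matching and a small backtracking search. The Python code below is an illustrative reading of the definition, not one of the algorithms of [25]. The term encoding (tuples for atoms and compound terms, capitalised strings for variables) and all function names are assumptions; the variables of the subsumed clause are treated as constants during matching.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].isupper()


def match(pattern, target, subst):
    """Extend subst so that pattern instantiated by subst equals target, else None."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == target else None
        extended = dict(subst)
        extended[pattern] = target
        return extended
    if isinstance(pattern, tuple):
        if (not isinstance(target, tuple) or len(pattern) != len(target)
                or pattern[0] != target[0]):
            return None
        for p_arg, t_arg in zip(pattern[1:], target[1:]):
            subst = match(p_arg, t_arg, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == target else None


def subsumed_by(c1, c2):
    """C1 = A <- B is subsumed by C2 = D <- E if A = D.s and E.s is contained in B."""
    (a, b), (d, e) = c1, c2
    start = match(d, a, {})
    if start is None:
        return False

    def cover(remaining, subst):
        # Map every atom of E onto some atom of B, backtracking on failure.
        if not remaining:
            return True
        for candidate in b:
            extended = match(remaining[0], candidate, subst)
            if extended is not None and cover(remaining[1:], extended):
                return True
        return False

    return cover(list(e), start)


def subsumption_equivalent(p, q):
    return (all(any(subsumed_by(c, d) for d in q) for c in p)
            and all(any(subsumed_by(d, c) for c in p) for d in q))


c1 = (("p", "a"), [("q", "a"), ("r", "a")])     # p(a) <- q(a), r(a)
c2 = (("p", "X"), [("q", "X")])                 # p(X) <- q(X)
P = [c2, c1]
Q = [(("p", "Y"), [("q", "Y")])]                # p(Y) <- q(Y)

print(subsumed_by(c1, c2), subsumed_by(c2, c1), subsumption_equivalent(P, Q))
```

In this example the clause c1 is redundant in P because c2 already subsumes it, so P and Q are subsumption equivalent although they differ syntactically.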

Another equivalence relation considered in [30] is a refinement of subsumption
equivalence, named weak subsumption equivalence. Namely, two programs
are weakly subsumption equivalent if and only if the two programs without
tautologies are subsumption equivalent. As for the previous case, an equivalent
formulation is given in terms of a refinement of the $T(P)$ semantics, defined by
means of a $T(P) + Id$ function.

Furthermore, logical equivalence ($\models P \leftrightarrow Q$) and the
corresponding equivalence when only Herbrand models are considered
($M(P) = M(Q)$) are studied. It is also shown that these two equivalence
relations can be equivalently formulated in terms of the functional semantics
defined in [27].

Finally, the standard equivalence relation induced by the operational semantics
of logic programs is considered, which identifies programs with the same success
set ($SS(P) = SS(Q)$); the latter coincides with the least Herbrand model
semantics [40].

Different formulations of equivalence are also compared in terms of their
relative strength. In addition to the previously mentioned correspondences, it
is shown that subsumption equivalence is strictly finer than weak subsumption
equivalence, which is in turn strictly finer than logical equivalence, which is in
turn strictly finer than operational equivalence.

Some of the results presented in [30] are summarised in Figure 1, where an
arrow from $\equiv_1$ to $\equiv_2$ denotes that $\equiv_1$ is strictly finer than
$\equiv_2$ (viz., $\equiv_1 \subset \equiv_2$).

3 Composition of Definite Programs 

The union of programs is the most basic composition operation over logic pro- 
grams. Actually, every logic program consists of the union of all its clauses. 
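
$$\begin{array}{ccc}
P \text{ s-e } Q & \longleftrightarrow & T(P) = T(Q) \\
 & & \downarrow \\
P \text{ w s-e } Q & \longleftrightarrow & T(P) + Id = T(Q) + Id \\
 & & \downarrow \\
 & & M(P) = M(Q) \\
 & & \downarrow \\
SS(P) = SS(Q) & \longleftrightarrow & LHM(P) = LHM(Q)
\end{array}$$

Fig. 1. Equivalence hierarchy for logic programs.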



The starting point of our analysis is the observation that the standard (model- 
theoretic, fixpoint or operational) semantics of logic programs is not composi- 
tional w.r.t. the union of programs. 

The least Herbrand model is usually taken as the intended meaning of a
definite logic program. Unfortunately, the least Herbrand model of the union of
two programs cannot always be determined from the least Herbrand models of
the separate programs.

Example 1. For instance the program:

$$\mathit{fallible}(x) \leftarrow \mathit{human}(x)$$

is equivalent to the empty program, as the empty set is the least model of both
programs. On the other hand, if these programs are composed with the program:

$$\mathit{human}(\mathit{socrates}) \leftarrow$$

we obtain two programs which have different least models
($\{\mathit{human}(\mathit{socrates}), \mathit{fallible}(\mathit{socrates})\}$ and
$\{\mathit{human}(\mathit{socrates})\}$, respectively). $\Box$

The above example shows that the least Herbrand model semantics is not com- 
positional w.r.t. the union of programs. The same observation applies to the 
standard least fixpoint and to the standard operational semantics, as these three 
semantics are all equivalent for definite programs [40] . 
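
The failure of compositionality in Example 1 can be replayed on the ground instances of the clauses. The sketch below is only an illustration for finite ground programs: the clause encoding (head, frozenset of body atoms) and the names `tp` and `lhm` are assumptions made here, and the least Herbrand model is computed by iterating the immediate consequence operator from the empty interpretation.

```python
def tp(program, interp):
    """Immediate consequences T(P)(I) of a ground definite program."""
    return frozenset(head for head, body in program if body <= interp)

def lhm(program):
    """Least Herbrand model: iterate T(P) from the empty set to its fixpoint."""
    interp = frozenset()
    while True:
        nxt = tp(program, interp)
        if nxt == interp:
            return interp
        interp = nxt

# Ground instances of Example 1 over the single constant socrates.
P = {("fallible(socrates)", frozenset({"human(socrates)"}))}
EMPTY = set()
R = {("human(socrates)", frozenset())}

same_alone = lhm(P) == lhm(EMPTY)             # both least models are empty
same_composed = lhm(P | R) == lhm(EMPTY | R)  # they differ once R is added
print(same_alone, same_composed)  # True False
```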

A number of different compositional denotational semantics for logic pro- 
grams have been proposed. In the next sections we will present some of those 
semantics, and analyse the existing relations among them. 



3.1 Subsumption Equivalence 

One of the first compositional semantics for logic programs was presented in
[31]. Intuitively speaking, the idea of [31] is to adopt a higher-order semantics
in order to achieve a compositional denotation of programs. Simply stated, a
program $P$ is denoted by its immediate consequence operator $T(P)$ rather than
by the least fixpoint of $T(P)$, as done in the standard least fixpoint semantics
of logic programs.

Indeed, the immediate consequences of the union of two programs can be
determined from the immediate consequences of the two programs in the following
way [31]:

$$T(P \cup Q)(I) = T(P)(I) \cup T(Q)(I)$$

(where, abusing notation, the $\cup$ on the left-hand side denotes program union
while the $\cup$ on the right-hand side denotes set-theoretic union).
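
As a small sanity check of this identity, the following sketch evaluates both sides on every interpretation over a three-atom Herbrand base. The two programs, the base and the helper names are hypothetical and chosen only for illustration.

```python
from itertools import chain, combinations

def tp(program, interp):
    """Immediate consequences T(P)(I) of a ground definite program."""
    return frozenset(head for head, body in program if body <= interp)

def interpretations(base):
    """All subsets of a finite Herbrand base."""
    atoms = sorted(base)
    subsets = chain.from_iterable(combinations(atoms, n) for n in range(len(atoms) + 1))
    return [frozenset(s) for s in subsets]

P = {("a", frozenset({"b"})), ("c", frozenset())}             # a <- b    c <-
Q = {("b", frozenset({"c"})), ("a", frozenset({"c"}))}        # b <- c    a <- c
BASE = {"a", "b", "c"}

identity_holds = all(tp(P | Q, i) == tp(P, i) | tp(Q, i) for i in interpretations(BASE))
print(identity_holds)  # True
```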

Let us denote by $\equiv_T$ the equivalence relation induced by the $T(P)$ semantics:

$$P \equiv_T Q \iff T(P) = T(Q).$$

Namely, two programs are equivalent if and only if their immediate consequence
operators coincide on every Herbrand interpretation. The equivalence relation
$\equiv_T$ is a congruence for the union of programs (as well as for several other
interesting composition operations, as shown for instance in [7]). Moreover, the
equivalence relation $\equiv_T$ preserves the observables since:

$$\mathcal{O}(P) = SS(P) = LHM(P) = T^{\omega}(P)(\emptyset).$$

The equivalence relation $\equiv_T$ is hence a congruence for
$(\mathcal{O}, \{\cup\})$. It is however easy to observe that $\equiv_T$ is not
fully abstract for $(\mathcal{O}, \{\cup\})$. Indeed there are programs which are
not subsumption equivalent, though they cannot be distinguished operationally.

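Example 2. Consider for instance the programs:

$$\begin{array}{ll}
P & Q \\
a \leftarrow & a \leftarrow b \\
b \leftarrow & b \leftarrow
\end{array}$$

We see that $P$ and $Q$ are indistinguishable under $(\mathcal{O}, \{\cup\})$
though they are not subsumption equivalent, since $T(P)(\emptyset) = \{a, b\}$
while $T(Q)(\emptyset) = \{b\}$. $\Box$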
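
Example 2 can be checked by brute force when the vocabulary is restricted to the two atoms a and b. The sketch below, whose encoding and names are assumptions of this illustration, enumerates all 256 programs that can be written over that vocabulary and uses each of them as a context. Contexts mentioning further atoms are not covered by this check.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    subsets = chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def tp(program, interp):
    """Immediate consequences T(P)(I) of a ground definite program."""
    return frozenset(head for head, body in program if body <= interp)

def lhm(program):
    """Least Herbrand model, computed by iterating T(P) from the empty set."""
    interp = frozenset()
    while tp(program, interp) != interp:
        interp = tp(program, interp)
    return interp

BASE = ["a", "b"]
P = frozenset({("a", frozenset()), ("b", frozenset())})         # a <-     b <-
Q = frozenset({("a", frozenset({"b"})), ("b", frozenset())})    # a <- b   b <-

# Every clause over {a, b} (2 heads x 4 bodies) and every set of such clauses.
all_clauses = [(head, body) for head in BASE for body in powerset(BASE)]
contexts = powerset(all_clauses)

t_differs = tp(P, frozenset()) != tp(Q, frozenset())
indistinguishable = all(lhm(P | r) == lhm(Q | r) for r in contexts)
print(t_differs, indistinguishable)  # True True
```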

3.2 Weak Subsumption Equivalence 

The weak subsumption equivalence relation was introduced in [30] as a refinement
of subsumption equivalence. Two programs are weakly subsumption equivalent
if and only if the two programs without tautologies are subsumption equivalent.

Weak subsumption equivalence can be characterised in terms of the function
$T(P)$ by introducing a new operator $(T(P) + Id)$ defined as follows:

$$(T(P) + Id)(I) = I \cup T(P)(I)$$

and then by proving that [30]:

$$P \text{ is weakly subsumption equivalent to } Q \iff (T(P) + Id = T(Q) + Id).$$

Let us denote by $\equiv_{T+Id}$ the equivalence relation induced by $T(P) + Id$,
that is:

$$P \equiv_{T+Id} Q \iff (T(P) + Id = T(Q) + Id).$$

As shown in [12], $\equiv_{T+Id}$ is a congruence for the union of programs.
Indeed, for any interpretation $I$:

$$(T(P_1 \cup P_2) + Id)(I) = I \cup T(P_1 \cup P_2)(I) = I \cup T(P_1)(I) \cup T(P_2)(I).$$

Therefore, if $P_1 \equiv_{T+Id} Q_1$ and $P_2 \equiv_{T+Id} Q_2$ then for all $I$:

$$(T(P_1 \cup P_2) + Id)(I) = (T(Q_1 \cup Q_2) + Id)(I).$$

Since the equivalence relation $\equiv_{T+Id}$ preserves $\mathcal{O}$ [30], it is
hence compositional for $(\mathcal{O}, \{\cup\})$. Weak subsumption equivalence is
coarser than subsumption equivalence, that is, $\equiv_{T+Id}$ distinguishes fewer
programs than $\equiv_T$.

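Example 3. For instance $\equiv_T$ distinguishes the programs:

$$\begin{array}{ll}
P & Q \\
a \leftarrow b & a \leftarrow b \\
 & b \leftarrow b
\end{array}$$

(since $T(P)(\{b\}) \subset T(Q)(\{b\})$) while they are equivalent under
$\equiv_{T+Id}$. Indeed programs $P$ and $Q$ are identical up to tautologies and
for each $I$:

$$(T(P) + Id)(I) = (T(Q) + Id)(I) =
\begin{cases} I \cup \{a\} & \text{if } b \in I \\ I & \text{otherwise} \end{cases}$$

$\Box$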
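
The difference between the two operators in Example 3 can be made concrete with a short sketch. The encoding and helper names are assumptions made for this illustration. It compares $T(P)$ and $T(P) + Id$ for the two programs on all four interpretations over the atoms a and b.

```python
from itertools import chain, combinations

def tp(program, interp):
    """Immediate consequences T(P)(I) of a ground definite program."""
    return frozenset(head for head, body in program if body <= interp)

def tp_plus_id(program, interp):
    """(T(P) + Id)(I) = I united with T(P)(I)."""
    return interp | tp(program, interp)

def interpretations(base):
    """All subsets of a finite Herbrand base."""
    atoms = sorted(base)
    subsets = chain.from_iterable(combinations(atoms, n) for n in range(len(atoms) + 1))
    return [frozenset(s) for s in subsets]

P = {("a", frozenset({"b"}))}                                # a <- b
Q = {("a", frozenset({"b"})), ("b", frozenset({"b"}))}       # a <- b   plus the tautology b <- b
ALL = interpretations({"a", "b"})

same_t = all(tp(P, i) == tp(Q, i) for i in ALL)
same_t_plus_id = all(tp_plus_id(P, i) == tp_plus_id(Q, i) for i in ALL)
print(same_t, same_t_plus_id)  # False True
```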

As for the case of $\equiv_T$, we can however observe that $\equiv_{T+Id}$ is not
fully abstract for $(\mathcal{O}, \{\cup\})$. For instance, programs $P$ and $Q$ of
Example 2 are indistinguishable under $(\mathcal{O}, \{\cup\})$ though they are not
weakly subsumption equivalent, since $(T(P) + Id)(\emptyset) = \{a, b\}$ and
$(T(Q) + Id)(\emptyset) = \{b\}$.

3.3 Logical Equivalence 

While subsumption equivalence and weak subsumption equivalence are both
compositional for $(\mathcal{O}, \{\cup\})$, they are not fully abstract for
$(\mathcal{O}, \{\cup\})$ since they both distinguish programs that are instead
operationally indistinguishable. If we look for a fully abstract denotation of
programs, we must then consider some weaker equivalence relation over programs.
The natural next candidate to examine, following the hierarchy of Figure 1, is
logical equivalence.

In the case of logic programs, logical equivalence coincides with the
equivalence induced by the set of (all) Herbrand models of a program. Two definite
programs are logically equivalent if and only if they have the same Herbrand
models. If we denote by $M(P)$ the set of Herbrand models of a program $P$:

$$M(P) = \{I \mid I \models P\}$$

then logical equivalence can be denoted as follows:

$$P \equiv_M Q \iff M(P) = M(Q).$$

The reason why the least Herbrand model semantics does not properly cope
with program composition derives from the underlying Closed World Assump- 
tion (CWA) [39]. According to the CWA, and to the corresponding completion 
semantics [15], a logic program is interpreted as a complete knowledge specifica- 
tion. Such an interpretation does not reflect the implicit assumption underlying 
program composition, that is, that a program is an incomplete chunk of knowl- 
edge to be possibly completed with other knowledge. As a consequence, each 
program cannot be simply denoted by its least Herbrand model, where only 
provable formulae are considered. Also non-minimal Herbrand models of a pro- 
gram must be considered, including formulae not provable in the program, but 
which can possibly become provable after some program composition. 

In this perspective, a compositional semantics of logic programs was defined 
in [11] by denoting a program with the set of all its Herbrand models. Indeed the 
Herbrand models of the union of two programs can be determined by the Her- 
brand models of the separate programs, as shown by the following observation. 



Observation 1. Let $P$ and $Q$ be definite programs. Then:

$$I \in M(P \cup Q) \iff I \in M(P) \wedge I \in M(Q).$$

Namely, an interpretation $I$ is a model of the union of two programs if and only
if $I$ is a model of both programs. Therefore the set of models of the union of
two programs coincides with the intersection of the sets of models of the two
programs, that is:

$$M(P \cup Q) = M(P) \cap M(Q)$$

and the least Herbrand model of the union of two programs is hence the least
Herbrand interpretation which is a model of both programs.
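
Observation 1 lends itself to a direct check on small ground programs. In the sketch below the programs, the base and the function names are hypothetical. The Herbrand models are enumerated explicitly, and the least model of the union is obtained as the intersection of all its models.

```python
from itertools import chain, combinations

def interpretations(base):
    """All subsets of a finite Herbrand base."""
    atoms = sorted(base)
    subsets = chain.from_iterable(combinations(atoms, n) for n in range(len(atoms) + 1))
    return [frozenset(s) for s in subsets]

def is_model(interp, program):
    """I is a model iff every clause whose body is true in I has its head in I."""
    return all(head in interp for head, body in program if body <= interp)

def models(program, base):
    return {i for i in interpretations(base) if is_model(i, program)}

BASE = {"a", "b", "c"}
P = {("a", frozenset({"b"})), ("c", frozenset())}   # a <- b    c <-
Q = {("b", frozenset({"c"}))}                       # b <- c

intersection_holds = models(P | Q, BASE) == models(P, BASE) & models(Q, BASE)
least_model = frozenset.intersection(*models(P | Q, BASE))
print(intersection_holds, sorted(least_model))  # True ['a', 'b', 'c']
```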

Logical equivalence preserves the least Herbrand model semantics
$\equiv_{LHM}$, since $\mathcal{O}(P) = \bigcap \{I \mid I \in M(P)\}$, and hence
logical equivalence is compositional for $(\mathcal{O}, \{\cup\})$. This means that
if two programs are logically equivalent then they are also operationally
indistinguishable, that is, they exhibit the same observable behaviour in all
possible contexts.

Differently from subsumption equivalence and weak subsumption equivalence,
logical equivalence is fully abstract for $(\mathcal{O}, \{\cup\})$. Indeed, as
proved in [12], programs which are indistinguishable w.r.t.
$(\mathcal{O}, \{\cup\})$ are also logically equivalent. It is perhaps worth
recalling here a sketch of the proof of the full abstraction of logical
equivalence reported in [12].

The proof shows that if two programs $P$ and $Q$ are not logically equivalent,
then there exists a context in which they exhibit different observational
behaviour. By definition of logical equivalence, if $P \not\equiv_M Q$ then there
exists an interpretation $I$ such that $I \in M(P)$ and $I \notin M(Q)$. By
definition of Herbrand model [29], this means that $T(P)(I) \subseteq I$ and
$T(Q)(I) \not\subseteq I$. This implies that there exists a finite subset $F$ of
$I$ such that $A \in T(Q)(F)$ while $A \notin F$, for some atom $A$. The proof is
finally concluded by considering the program $R = \{B \leftarrow \mid B \in F\}$
and by showing that $A \notin \mathcal{O}(P \cup R)$ while
$A \in \mathcal{O}(Q \cup R)$.

Example 4. Consider for instance the programs:

$$\begin{array}{ll}
P & Q \\
a(x) \leftarrow c(x) & a(x) \leftarrow b(x) \\
b(x) \leftarrow c(x) & b(x) \leftarrow c(x)
\end{array}$$

Any interpretation of the form:

$$I = \{a(t) \mid t \in T\} \cup \{b(u) \mid u \in U\}$$

(where $T$ and $U$ are possibly infinite sets of ground terms such that
$T \subset U$) is a model for $P$ and not for $Q$. Following the above proof
sketch, we observe that there exists a finite subset $F$ of $I$ (for instance
$F = \{b(k)\}$ for any $k \in U - T$) such that $a(k) \in T(Q)(F)$ and
$a(k) \notin F$. If we then consider the program:

$$\begin{array}{l}
R \\
b(k) \leftarrow
\end{array}$$

we see that $a(k) \notin \mathcal{O}(P \cup R)$ while
$a(k) \in \mathcal{O}(Q \cup R)$. $\Box$
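
The construction used in the proof sketch can be turned into a small search procedure for finite ground programs. The sketch below is an illustration under that finiteness assumption, with hypothetical names throughout. It looks for an interpretation that is a model of exactly one of the two programs and returns its atoms as a set of facts, which then serves as the distinguishing context. It is applied to the instance of Example 4 over a single constant k.

```python
from itertools import chain, combinations

def interpretations(base):
    """All subsets of a finite Herbrand base, smallest first."""
    atoms = sorted(base)
    subsets = chain.from_iterable(combinations(atoms, n) for n in range(len(atoms) + 1))
    return [frozenset(s) for s in subsets]

def tp(program, interp):
    return frozenset(head for head, body in program if body <= interp)

def lhm(program):
    interp = frozenset()
    while tp(program, interp) != interp:
        interp = tp(program, interp)
    return interp

def is_model(interp, program):
    return tp(program, interp) <= interp

def distinguishing_facts(p, q, base):
    """A set of facts R with LHM(p U R) != LHM(q U R), or None if p and q
    have the same Herbrand models over the given finite base."""
    for interp in interpretations(base):
        if is_model(interp, p) != is_model(interp, q):
            return {(atom, frozenset()) for atom in interp}
    return None

BASE = {"a(k)", "b(k)", "c(k)"}
P = {("a(k)", frozenset({"c(k)"})), ("b(k)", frozenset({"c(k)"}))}
Q = {("a(k)", frozenset({"b(k)"})), ("b(k)", frozenset({"c(k)"}))}

R = distinguishing_facts(P, Q, BASE)
print(R == {("b(k)", frozenset())})            # True
print(sorted(lhm(P | R)), sorted(lhm(Q | R)))  # ['b(k)'] ['a(k)', 'b(k)']
```

Taking the whole interpretation as the set of facts is enough here because the base is finite. The proof in [12] needs a finite subset $F$ because interpretations may be infinite in general.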

The results proved in [12] establish that logical equivalence is the fully abstract
compositional equivalence relation for $(\mathcal{O}, \{\cup\})$. This means that
Herbrand models induce the coarsest equivalence relation on programs w.r.t.
$(\mathcal{O}, \{\cup\})$, in that any other denotation of programs one may choose
must induce the same equivalence relation to be compositional and fully abstract.



3.4 Admissible Models 

Before the full abstraction of logical equivalence was established, a different 
compositional model-theoretic semantics for definite programs was presented 
in [10]. The idea of [10] was to model the composition of definite programs 
by denoting each program with a subset of its Herbrand models, called the 
admissible Herbrand models. Intuitively speaking, a model is considered to be 
admissible if it is “supported” by a set of hypotheses which all occur in the bodies 
of the program clauses. The intuition behind the admissible model semantics is 
to consider only those Herbrand models which somehow denote the effects of the 
possible compositions of a program with other programs. 

It is worth observing that each Herbrand model is “supported” by the
assumption of a set of hypotheses.

Lemma 1. Let $P$ be a program and let $I \subseteq \mathcal{B}$. Then:

$$I \in M(P) \iff \exists H \subseteq \mathcal{B} : I = LHM(P \cup H).$$

Following [10], a set of admissible hypotheses is formally defined as follows:

A set $H \subseteq \mathcal{B}$ is an admissible set of hypotheses for a program
$P$ if and only if for all $h \in H$ there exists a ground instance
$A \leftarrow B$ of a clause in $P$ such that $h \in B$.

The set of possible admissible hypotheses for a program $P$ is hence defined as
follows:

$$AH(P) = \{h \mid h \in \mathcal{B} \wedge \exists A, B : (A \leftarrow B \in ground(P) \wedge h \in B)\}.$$

The notion of admissible model is then defined as follows:

Let $P$ be a program, let $I \subseteq \mathcal{B}$, and let $H \subseteq AH(P)$.
Then $I$ is an admissible model for $P$ under the hypotheses $H$ if and only if
$I = LHM(P \cup H)$.

A model $I$ that is admissible under the set of hypotheses $H$ is denoted by $I(H)$
to explicitly record the set of hypotheses supporting it. The set of admissible
models $AM(P)$ for a program $P$ is hence defined as follows:

$$AM(P) = \{I(H) \mid I \subseteq \mathcal{B} \wedge H \subseteq AH(P) \wedge I = LHM(P \cup H)\}.$$

It is easy to observe that the least Herbrand model is always an admissible model
(under the empty set of hypotheses).

Example 5. Consider for instance the program $P$:

$$\begin{array}{l}
P \\
a \leftarrow b \\
c \leftarrow
\end{array}$$

which has three Herbrand models: $\{c\}$, $\{a,c\}$, and $\{a,b,c\}$. Since $b$ is
the only admissible hypothesis for $P$, there are only two admissible models for
$P$: $\{c\}$ (admissible under the empty set of hypotheses) and $\{a,b,c\}$
(admissible under the set of hypotheses $\{b\}$). The model $\{a,c\}$ is instead
considered not admissible since there is no admissible set of hypotheses
supporting it. $\Box$
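
For finite ground programs the definitions of $AH(P)$ and $AM(P)$ can be executed directly. The following sketch uses an encoding and function names that are assumptions of this illustration. It enumerates the admissible models of the program of Example 5 as pairs made of a model and its supporting hypotheses.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of a finite collection of atoms, as frozensets."""
    items = sorted(items)
    subsets = chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def tp(program, interp):
    return frozenset(head for head, body in program if body <= interp)

def lhm(program):
    interp = frozenset()
    while tp(program, interp) != interp:
        interp = tp(program, interp)
    return interp

def admissible_hypotheses(program):
    """AH(P): atoms occurring in the body of some ground clause of P."""
    return frozenset(atom for _, body in program for atom in body)

def admissible_models(program):
    """AM(P) as a set of pairs (I, H) with H a subset of AH(P) and I = LHM(P U H)."""
    result = set()
    for hyps in powerset(admissible_hypotheses(program)):
        facts = {(h, frozenset()) for h in hyps}
        result.add((lhm(program | facts), hyps))
    return result

P = {("a", frozenset({"b"})), ("c", frozenset())}   # a <- b    c <-
for interp, hyps in sorted(admissible_models(P), key=lambda m: len(m[1])):
    print(sorted(interp), sorted(hyps))
# ['c'] []
# ['a', 'b', 'c'] ['b']
```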

As shown in [10], admissible models define a compositional semantics for
logic programs. Indeed the admissible models of the union of two programs can
be determined by composing the admissible models of the separate programs.
Such composition is defined in [10] by means of a $T(S)$ operator which, given
a set $S$ of admissible models, maps Herbrand interpretations into Herbrand
interpretations. Intuitively, the definition of $T(S)$ lifts the definition of
$T(P)$ from program clauses to program (admissible) models. Namely, $T(P)(I)$
yields the union of all the atoms $A$ such that $P$ contains an implication
“$A$ if $B$” whose premise $B$ is true in the interpretation $I$. Similarly,
$T(S)(I)$ yields the union of all the conclusions $J$ such that $J(H)$ is an
admissible model whose premise $H$ is true in $I$.

The equivalence induced by the admissible models semantics is defined as
follows:

$$P \equiv_{AM} Q \iff AM(P) = AM(Q).$$

Namely, $P \equiv_{AM} Q$ if and only if the set of pairs $(I, H)$ such that
$I(H)$ is an admissible model is the same for both programs. It is easy to observe
that the equivalence relation $\equiv_{AM}$ preserves the observables
$\mathcal{O}$ since

$$LHM(P) = \bigcap \{I \mid I(H) \in AM(P)\}$$

and hence $\equiv_{AM}$ is compositional for $(\mathcal{O}, \{\cup\})$.

We can however observe that $\equiv_{AM}$ is not fully abstract for
$(\mathcal{O}, \{\cup\})$. Indeed there are programs which do not have the same
set of admissible models, although they cannot be distinguished operationally.

Example 6. For instance the programs:

$$\begin{array}{ll}
P & Q \\
a \leftarrow b & a \leftarrow \\
b \leftarrow & b \leftarrow a
\end{array}$$

are indistinguishable under $(\mathcal{O}, \{\cup\})$ though they do not have the
same set of admissible models. Indeed $P$ has two admissible models,
$\{a, b\}(\emptyset)$ and $\{a, b\}(\{b\})$, while $Q$ has the admissible models
$\{a, b\}(\emptyset)$ and $\{a, b\}(\{a\})$. $\Box$

The above example also highlights that programs having the same Herbrand
models may have different sets of admissible models. Indeed this per se shows
that $\equiv_{AM}$ is not fully abstract, once the full abstraction of logical
equivalence has been established.



3.5 Minimal Admissible Models 

In Section 3.3 we have shown that the set of Herbrand models induces a fully
abstract compositional equivalence relation (viz., logical equivalence
$\equiv_M$). In the previous section we have shown that the idea of considering
only admissible Herbrand models yields a compositional equivalence relation
(viz., $\equiv_{AM}$) which is however not fully abstract.

An intriguing question is whether it is possible to refine the notion of admis- 
sible Herbrand model so as to restrict the set of admissible models of a program 
and to obtain a fully abstract denotation of programs. 

Example 6 reported at the end of the previous section shows that the set of 
admissible models of a program includes models that are somehow “redundant” 
in view of possible program compositions. 

Example 7. Consider again program $P$ of Example 6:

$$\begin{array}{l}
P \\
a \leftarrow b \\
b \leftarrow
\end{array}$$

We observe that the inclusion of the admissible model $\{a, b\}(\{b\})$ does not
really add much information to the program denotation, given the presence of the
model $\{a, b\}(\emptyset)$. Intuitively speaking, the possible effects of the
hypothesis $b$ becoming true (because of some program composition) are already
denoted by the admissible model $\{a, b\}(\emptyset)$. $\Box$

Following the above observation, the definition of admissible model may hence be
refined so as to exclude the somehow “redundant” models. Intuitively speaking,
we might consider $I(H)$ to be a “minimal” admissible model only if $H$ is the
minimal set of hypotheses needed to derive the set of conclusions $I$, that is,
only if:

$$\forall K \subset H : LHM(P \cup K) \subset LHM(P \cup H).$$

This constraint would eliminate some redundant admissible models; e.g.,
the indistinguishable programs $P$ and $Q$ of Example 6 would now have
$\{a, b\}(\emptyset)$ as the only admissible model.

A stronger constraint is however needed to avoid all redundant models.

Example 8. For instance the program:

$$\begin{array}{l}
P \\
a \leftarrow a
\end{array}$$

would still have two admissible models ($\emptyset(\emptyset)$ and
$\{a\}(\{a\})$) while being operationally indistinguishable from the empty
program. Intuitively speaking, the information contained in the model
$\{a\}(\{a\})$ for $P$ is somehow already contained in the model
$\emptyset(\emptyset)$, since the addition of the new hypothesis $a$ does not
add any other conclusion besides itself. $\Box$

We therefore say that an admissible model $I(K \cup A)$ is not redundant w.r.t.
another admissible model $J(K)$ only if the extra hypotheses $A$ add some other
conclusions besides themselves, that is, only if $(I - J) \supset A$ (where
$\supset$ denotes strict inclusion). Formally, we define the set of minimal
admissible models for a program $P$ as follows:

$$\mu AM(P) = \{I(H) \mid I \subseteq \mathcal{B} \wedge H \subseteq AH(P) \wedge I = LHM(P \cup H)
\wedge \forall K \subset H : LHM(P \cup H) - LHM(P \cup K) \supset H - K\}.$$

Let us consider a simple example in order to better illustrate the way in
which minimal admissible models restrict admissible models.

Example 9. Consider the program:

$$\begin{array}{l}
P \\
a \leftarrow b, c \\
b \leftarrow \\
c \leftarrow d, c
\end{array}$$

which has eight admissible models:

$$\begin{array}{ll}
\{b\}(\emptyset) & \{a,b,c\}(\{c\}) \\
\{b\}(\{b\}) & \{a,b,c\}(\{b,c\}) \\
\{b,d\}(\{d\}) & \{a,b,c,d\}(\{c,d\}) \\
\{b,d\}(\{b,d\}) & \{a,b,c,d\}(\{b,c,d\})
\end{array}$$

Remarkably, only two of these models are minimal admissible models, that is:

$$\mu AM(P) = \{\{b\}(\emptyset), \{a,b,c\}(\{c\})\}.$$

Indeed the model $\{b\}(\{b\})$ is redundant w.r.t. $\{b\}(\emptyset)$ since
$LHM(P \cup \{b\}) = LHM(P)$. In other words the addition of the hypothesis $b$
does not add any new conclusion. The models $\{b,d\}(\{d\})$ and
$\{b,d\}(\{b,d\})$ are redundant w.r.t. $\{b\}(\emptyset)$ too, since in both
cases $LHM(P \cup H) - LHM(P) = \{d\} \not\supset \{d\}$. Again, the addition of
the set of hypotheses $\{d\}$ or $\{b,d\}$ does not add any other conclusion
besides the hypotheses themselves. Similar considerations apply to the other
non-minimal admissible models $\{a,b,c\}(\{b,c\})$, $\{a,b,c,d\}(\{c,d\})$, and
$\{a,b,c,d\}(\{b,c,d\})$, which are all redundant w.r.t. $\{a,b,c\}(\{c\})$. $\Box$
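
The minimality condition can also be executed on finite ground programs. In the sketch below the encoding and all names are assumptions of this illustration. Both inclusions in the definition are read as strict, which is the reading that reproduces the result of Example 9. The program used is the three-clause program of that example.

```python
from itertools import chain, combinations

def powerset(items):
    items = sorted(items)
    subsets = chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def tp(program, interp):
    return frozenset(head for head, body in program if body <= interp)

def lhm(program):
    interp = frozenset()
    while tp(program, interp) != interp:
        interp = tp(program, interp)
    return interp

def admissible_hypotheses(program):
    return frozenset(atom for _, body in program for atom in body)

def minimal_admissible_models(program):
    """Keep I(H) only if, for every proper subset K of H,
    LHM(P U H) - LHM(P U K) is a proper superset of H - K."""
    def model(hyps):
        return lhm(program | {(h, frozenset()) for h in hyps})
    result = set()
    for hyps in powerset(admissible_hypotheses(program)):
        if all(model(hyps) - model(k) > hyps - k for k in powerset(hyps) if k < hyps):
            result.add((model(hyps), hyps))
    return result

P = {("a", frozenset({"b", "c"})),    # a <- b, c
     ("b", frozenset()),              # b <-
     ("c", frozenset({"d", "c"}))}    # c <- d, c

for interp, hyps in sorted(minimal_admissible_models(P), key=lambda m: len(m[0])):
    print(sorted(interp), sorted(hyps))
# ['b'] []
# ['a', 'b', 'c'] ['c']
```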

The following proposition shows that for each admissible model $I(H)$ there exists
a minimal admissible model $J(K)$ supported by a smaller set of hypotheses and
such that $LHM(P \cup H) = LHM(P \cup K) \cup (H - K)$.

Proposition 1. Let $P$ be a program, let $I \subseteq \mathcal{B}$ and let
$H \subseteq \mathcal{B}$. Then:

$$I(H) \in AM(P) \implies \exists K, J : (K \subseteq H \wedge J(K) \in \mu AM(P) \wedge I = J \cup (H - K)).$$

We finally prove that the equivalence relation $\equiv_{\mu AM}$, induced by the
set of minimal admissible models of a program, does coincide with the fully
abstract equivalence relation $\equiv_M$. To simplify the equivalence proof, we
first provide the following alternative characterization of logical equivalence.

Lemma 2. Let $P$ and $Q$ be two programs. Then:

$$P \equiv_M Q \iff \forall H \subseteq \mathcal{B} : LHM(P \cup H) = LHM(Q \cup H).$$

We are now ready to establish that the equivalence relations $\equiv_{\mu AM}$ and
$\equiv_M$ coincide.

Proposition 2. $\equiv_M \; = \; \equiv_{\mu AM}$.

To conclude our analysis of admissible models, let us reconsider Example 9 to 
illustrate the relation between Herbrand and (minimal) admissible models. 

Example 10. Consider again the program:

$$\begin{array}{l}
P \\
a \leftarrow b, c \\
b \leftarrow \\
c \leftarrow d, c
\end{array}$$

If $\mathcal{B} = \{a,b,c,d\}$ then the sets of Herbrand, admissible and minimal
admissible models of program $P$ are, respectively:

$$\begin{array}{lll}
HM(P) & AM(P) & \mu AM(P) \\
\hline
\{b\} & \{b\}(\emptyset) & \{b\}(\emptyset) \\
 & \{b\}(\{b\}) & \\
\{a,b\} & & \\
\{b,d\} & \{b,d\}(\{d\}) & \\
 & \{b,d\}(\{b,d\}) & \\
\{a,b,c\} & \{a,b,c\}(\{c\}) & \{a,b,c\}(\{c\}) \\
 & \{a,b,c\}(\{b,c\}) & \\
\{a,b,d\} & & \\
\{a,b,c,d\} & \{a,b,c,d\}(\{c,d\}) & \\
 & \{a,b,c,d\}(\{b,c,d\}) &
\end{array}$$

It is worth noting that since $\mu AM(P) = \{\{b\}(\emptyset), \{a,b,c\}(\{c\})\}$,
the minimal admissible model semantics (correctly) identifies the above program
$P$ for instance with the program:

$$\begin{array}{l}
Q \\
a \leftarrow c \\
b \leftarrow
\end{array}$$

whose Herbrand, admissible, and minimal admissible models are, respectively:

$$\begin{array}{lll}
HM(Q) & AM(Q) & \mu AM(Q) \\
\hline
\{b\} & \{b\}(\emptyset) & \{b\}(\emptyset) \\
\{a,b\} & & \\
\{b,d\} & & \\
\{a,b,c\} & \{a,b,c\}(\{c\}) & \{a,b,c\}(\{c\}) \\
\{a,b,d\} & & \\
\{a,b,c,d\} & &
\end{array}$$

We see that while the admissible model semantics distinguishes $P$ and $Q$, the
two programs are identified both by the minimal admissible model semantics
($\mu AM(P) = \mu AM(Q)$) and by logical equivalence ($HM(P) = HM(Q)$). $\Box$



3.6 Supported Interpretations 

We finally introduce an alternative characterization of logical equivalence, which 
is defined by means of the notion of supported interpretation originally introduced 
in [9]. 

Besides providing another equivalent formulation of logical equivalence, the 
notion of supported interpretation will be exploited in the following sections to 
define a compositional semantics for extended logic programs, such as programs 
containing negation. 

Intuitively speaking, a (Herbrand) interpretation $I$ for a program $P$ is
supported by a set of hypotheses $H$ if $I$ is the least Herbrand model of the
program $P$ extended with $H$. More precisely, $I$ is an interpretation for $P$
supported by $H$ if and only if $I$ is the least Herbrand model of the program
$P \cup H$, where $P \cup H$ stands for $P \cup \{h \leftarrow \mid h \in H\}$.

If the sets of hypotheses to be considered are arbitrary subsets of the
Herbrand base $\mathcal{B}$, we obtain the following definition of supported
interpretation:

Let $P$ be a program, and let $I \subseteq \mathcal{B}$, $H \subseteq \mathcal{B}$.
Then $I$ is an interpretation for $P$ supported by $H$ if and only if
$I = LHM(P \cup H)$.

An interpretation $I$ supported by a set of hypotheses $H$ is denoted by $I(H)$ to
explicitly record the set of hypotheses supporting it.

The set of supported interpretations for a program $P$ is then defined as
follows:

$$SI(P, \mathcal{B}) = \{I(H) \mid H \subseteq \mathcal{B} \wedge I = LHM(P \cup H)\}.$$

As pointed out in [9], the above notion of supportedness properly extends the
notion of admissibility introduced in Section 3.4. Namely, the admissible models
of a program are in general a subset of its supported interpretations, that is,
$AM(P) \subseteq SI(P, \mathcal{B})$.

Lemma 1 shows that there is a direct correspondence between the supported
interpretations and the Herbrand models of a program. Namely, for each
program $P$:

$$I \in M(P) \iff \exists H : I(H) \in SI(P, \mathcal{B}).$$

Let us denote by $\equiv_{SI(\mathcal{B})}$ the equivalence relation induced by
the set of supported interpretations of a program, that is:

$$P \equiv_{SI(\mathcal{B})} Q \iff SI(P, \mathcal{B}) = SI(Q, \mathcal{B}).$$

We now establish that the equivalence relation $\equiv_{SI(\mathcal{B})}$
coincides with logical equivalence $\equiv_M$.

Proposition 3. $\equiv_M \; = \; \equiv_{SI(\mathcal{B})}$.

In the following sections, we will use a more general definition of supported
interpretation by considering a set $\mathcal{H}$ of assumable hypotheses, where
$\mathcal{H}$ is some pre-defined subset of the Herbrand base $\mathcal{B}$ (viz.,
$\mathcal{H} \subseteq \mathcal{B}$). Namely, the set of supported interpretations
of a program $P$ w.r.t. a set of assumable hypotheses $\mathcal{H}$ is defined as
follows:

$$SI(P, \mathcal{H}) = \{I(H) \mid H \subseteq \mathcal{H} \wedge I = LHM(P \cup H)\}.$$

Notice that the choice of the set of assumable hypotheses affects the induced
equivalence relations, as pointed out by the following proposition.

Proposition 4. Let $\mathcal{H} \subseteq \mathcal{B}$,
$\mathcal{K} \subseteq \mathcal{B}$ be two sets of assumable hypotheses. Then:

(1) $\mathcal{H} \subseteq \mathcal{K} \implies \; \equiv_{SI(\mathcal{K})} \; \subseteq \; \equiv_{SI(\mathcal{H})}$

(2) $\mathcal{H} \subset \mathcal{K} \implies \; \equiv_{SI(\mathcal{K})} \; \subset \; \equiv_{SI(\mathcal{H})}$
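
The effect of the choice of assumable hypotheses can be illustrated on a tiny case. The sketch below is hypothetical in its programs, names and hypothesis sets. It computes the supported interpretations of a finite ground program for a given set of assumable hypotheses, and shows two programs that are identified when only b may be assumed but are told apart once c may be assumed as well.

```python
from itertools import chain, combinations

def powerset(items):
    items = sorted(items)
    subsets = chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def tp(program, interp):
    return frozenset(head for head, body in program if body <= interp)

def lhm(program):
    interp = frozenset()
    while tp(program, interp) != interp:
        interp = tp(program, interp)
    return interp

def supported_interpretations(program, hypotheses):
    """Pairs (I, H) with H a subset of the assumable hypotheses and I = LHM(P U H)."""
    return {(lhm(program | {(h, frozenset()) for h in hyps}), hyps)
            for hyps in powerset(hypotheses)}

P = {("a", frozenset({"c"}))}     # a <- c
Q = set()                         # the empty program
small, large = {"b"}, {"b", "c"}

same_small = supported_interpretations(P, small) == supported_interpretations(Q, small)
same_large = supported_interpretations(P, large) == supported_interpretations(Q, large)
print(same_small, same_large)  # True False
```

Enlarging the set of assumable hypotheses can therefore only refine the induced equivalence, which is the direction stated in Proposition 4.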

3.7 Summary 

In the previous sections, we analysed different compositional semantics for def- 
inite logic programs that have been proposed in the literature. The relations 
between the semantics considered are summarized in Figure 2. 

We have first considered three equivalence relations analyzed in [30]:
subsumption equivalence ($\equiv_T$), weak subsumption equivalence
($\equiv_{T+Id}$), and logical equivalence ($\equiv_M$). We have shown that they
are all compositional w.r.t. $(\mathcal{O}, \{\cup\})$, and while the first two
semantics are not fully abstract w.r.t. $(\mathcal{O}, \{\cup\})$, logical
equivalence is the fully abstract compositional equivalence w.r.t.
$(\mathcal{O}, \{\cup\})$.

We have then considered a different equivalence relation ($\equiv_{AM}$) defined
in terms of the admissible Herbrand models of a program. We have shown that
while admissible models induce a compositional denotation of programs, the
corresponding equivalence relation is not fully abstract w.r.t.
$(\mathcal{O}, \{\cup\})$. We have then shown how the notion of admissible model
can be suitably constrained so as to obtain a fully abstract denotation of
programs. The new equivalence relation $\equiv_{\mu AM}$, induced by the so-called
minimal admissible models of a program, coincides with the fully abstract
equivalence relation $\equiv_M$.

Finally, we have introduced the notion of supported interpretation that will
be used in the following sections. We have shown that supported interpretations
provide an alternative definition of logical equivalence, that is, the induced
equivalence relation $\equiv_{SI(\mathcal{B})}$ coincides with the fully abstract
compositional equivalence relation $\equiv_M$.

The relations between the semantics considered are summarized in Figure 2,
where a double arrow from $\equiv_a$ to $\equiv_b$ denotes that $\equiv_a$
coincides with $\equiv_b$, while a single arrow from $\equiv_a$ to $\equiv_b$
denotes that $\equiv_a$ is strictly finer than $\equiv_b$, that is,
$\equiv_a \subset \equiv_b$.



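$$\begin{array}{ccccc}
 & & T(P) = T(Q) & & \\
 & & \downarrow & & \\
 & & T(P) + Id = T(Q) + Id & & AM(P) = AM(Q) \\
 & & \downarrow & & \downarrow \\
SI(P, \mathcal{B}) = SI(Q, \mathcal{B}) & \longleftrightarrow & M(P) = M(Q) & \longleftrightarrow & \mu AM(P) = \mu AM(Q) \\
 & & \downarrow & & \\
 & & LHM(P) = LHM(Q) & &
\end{array}$$

Fig. 2. The new equivalence hierarchy.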



The chain of inclusions
$\equiv_T \subset \equiv_{T+Id} \subset \equiv_M \subset \equiv_{LHM}$ was
established in [30].

The strict inclusion $\equiv_{AM} \subset \equiv_{\mu AM}$ has been established in
Section 3.5. Indeed the minimal admissible models $\mu AM(P)$ of a program $P$ are
obtained from the admissible models $AM(P)$ of $P$. It is hence easy to show that
$\equiv_{AM} \subseteq \equiv_{\mu AM}$, that is, if $P \equiv_{AM} Q$ then
$P \equiv_{\mu AM} Q$. Moreover, as shown in Section 3.5,
$\equiv_{\mu AM} \not\subseteq \equiv_{AM}$, that is, there exist programs that
have the same set of minimal admissible models, while they have different sets of
admissible models. For instance the empty program and the program $P$ of
Example 8 are distinguished by the admissible models semantics, since
$AM(P) = \{\emptyset(\emptyset), \{a\}(\{a\})\}$, while $\emptyset(\emptyset)$ is
the only admissible model for the empty program. On the other hand the two
programs are identified by the minimal admissible model semantics in that
$\emptyset(\emptyset)$ is the only minimal admissible model for both programs.

It is worth noting that while $\equiv_{AM}$ is strictly finer than
$\equiv_{\mu AM}$ (and hence than $\equiv_M$ and $\equiv_{SI(\mathcal{B})}$),
$\equiv_{AM}$ is not comparable with either $\equiv_T$ or $\equiv_{T+Id}$. Indeed
there are programs which are (weakly) subsumption equivalent but not equivalent
w.r.t. their admissible models, and vice versa.
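
Example 11. For instance consider the programs:

$$\begin{array}{ll}
P & Q \\
a \leftarrow & a \leftarrow \\
 & a \leftarrow a
\end{array}$$

We observe that $P \equiv_T Q$ (since $\forall I$: $T(P)(I) = \{a\} = T(Q)(I)$)
and $P \equiv_{T+Id} Q$ (since $\forall I$:
$(T(P) + Id)(I) = \{a\} \cup I = (T(Q) + Id)(I)$), while $P \not\equiv_{AM} Q$
since $\{a\}(\{a\}) \in AM(Q) - AM(P)$. On the other hand, if we consider the
programs:

$$\begin{array}{ll}
P & Q \\
a \leftarrow & a \leftarrow b \\
b \leftarrow & b \leftarrow \\
c \leftarrow b & c \leftarrow b
\end{array}$$

we see that $P \equiv_{AM} Q$ since
$AM(P) = AM(Q) = \{\{a,b,c\}(\emptyset), \{a,b,c\}(\{b\})\}$, while
$P \not\equiv_T Q$ and $P \not\equiv_{T+Id} Q$ (since
$T(P)(\emptyset) = \{a, b\}$ and $T(Q)(\emptyset) = \{b\}$). $\Box$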

In this section, we have examined a number of compositional semantics for defi- 
nite programs. However, as already anticipated in the Introduction, our analysis 
is not intended to be exhaustive. Other semantics have been proposed in the 
literature, such as [27,33] which were then extended by [31]. A survey describing 
these and other modular extensions of logic programming can be found in [14]. 

Moreover, our analysis focusses on standard Herbrand interpretations [2,29]. 
Many efforts have been devoted to define compositional semantics by using ex- 
tended interpretations containing possibly non-ground atoms (e.g., see [6,22]). 
The relation between these semantics and the semantics based on standard Her- 
brand interpretations is summarised in [1]. 

4 Composition of Normal Programs 

The need to extend definite programs to deal with forms of non-monotonic
reasoning has been recognized since the early years of logic programming. Negation
as failure was introduced in [15] to express negative information, and it has been
shown to support various forms of non-monotonic reasoning. A number of other
extensions have then been proposed to further enrich the expressive power of
logic programming as a general formalism for knowledge representation (see [4]
for a survey).

The formalization of these extensions has called for new semantics capable of
capturing their “intended” meaning. Even for the case of negation as failure, many
different characterizations have been defined from different perspectives, most of
them inspired by an interpretation of negation as failure as a more general notion
of negation by default (e.g., [19]). A survey of the semantics of logic programs
with negation (by default) is reported in [3].

Something similar happened for other extensions such as abductive logic
programming (e.g., [17,19,26]) and logic programming with a second form of
negation in addition to negation by default (e.g., [24,34,38]).







For each extension there is no general agreement on what its semantics should
be. Formal comparisons among different semantics are hard to draw, mainly
because they are often based on different grounds. Furthermore, many proposals
are based on a proof-theoretic approach rather than on a model-theoretic
approach, and this constitutes a further obstacle to the study of formal
properties, and hence to formal comparisons, of different proposals.

On these premises, analysing compositionality issues in extended logic programs
seems to be quite a difficult enterprise since:

— Many semantics have been proposed for different extensions of logic pro- 
gramming, and there is no general agreement on the intended meaning of 
each extension. 

— Non-monotonic reasoning and compositionality are intuitively orthogonal 
issues that do not seem easy to be reconciled. Indeed the semantics for ex- 
tended logic programs are typically non-compositional w.r.t. program union. 

Consider for instance the case of normal logic programs, that is, logic programs 
with negation as default. It is easy to show, for instance, that the stable model 
semantics [23] is not compositional with respect to the union of programs. 

Example 12. Consider the programs:

$P$: $a \leftarrow$

$Q$: $a \leftarrow not\ b$

which have the same (unique) stable model $\{a\}$. If these programs are extended
with the program:

$R$: $b \leftarrow$

we obtain two programs which have different stable models ($\{a, b\}$ and $\{b\}$,
respectively). Therefore it is not possible, in general, to determine the stable
models of a program from the stable models of its clauses. $\Box$
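
The behaviour described in Example 12 can be checked mechanically. The following is only a minimal sketch, assuming finite propositional programs encoded as triples (head, positive body, negative body); the function names `least_model` and `stable_models` are illustrative and do not come from the paper. Stable models are found by brute force with the Gelfond-Lifschitz reduct.

```python
from itertools import chain, combinations

def least_model(definite_clauses):
    """Least Herbrand model of a propositional definite program (naive fixpoint)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in definite_clauses:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model

def stable_models(program, atoms):
    """All stable models, found by testing every subset of `atoms` as a candidate."""
    atoms = sorted(atoms)
    candidates = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    result = []
    for cand in candidates:
        m = set(cand)
        # Reduct w.r.t. m: drop clauses whose negative body meets m,
        # then drop the negative literals of the remaining clauses.
        reduct = [(h, pos) for h, pos, neg in program if not (set(neg) & m)]
        if least_model(reduct) == m:
            result.append(m)
    return result

# Clauses are (head, positive_body, negative_body).
P = [("a", [], [])]        # a <-
Q = [("a", [], ["b"])]     # a <- not b
R = [("b", [], [])]        # b <-
atoms = {"a", "b"}

print(stable_models(P, atoms), stable_models(Q, atoms))
print(stable_models(P + R, atoms), stable_models(Q + R, atoms))
```

On these inputs P and Q share the single stable model {a}, while P ∪ R and Q ∪ R have the stable models {a, b} and {b} respectively, which is the failure of compositionality noted in the example.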

A unifying view of different extensions of logic programming was presented in 
[9], where it is shown how the meaning of various extensions of logic programming
can be expressed by means of the supported interpretations of a program. Many 
extensions are considered in [9], including negation-by-default, other forms of 
negation and abduction. 

The approach can be summarised as follows: 

1. Given an extended logic program, construct its “positive” version. 

2. Consider the set of supported interpretations of (the positive version of) the 
program. 

3. Select among the supported interpretations the complete models which
characterise the intended meaning of a program.

The interest of complete models derives from their correspondence relation with
other models proposed in the literature. Such a correspondence makes the
supported interpretation approach a general framework for characterising and
comparing different semantics of various extensions of logic programming.

In this paper we will focus only on one of these extensions, negation-by-default,
and on the corresponding class of normal programs.



4.1 Positive Version of a Normal Program 

A normal program $P$ is a set of clauses of the form

$A \leftarrow L_1, \ldots, L_n \quad (n \geq 0)$

where $A$ is an atom and $L_1, \ldots, L_n$ are literals. Negated literals in clause bodies
have the form not B, where B is an atom. In the following, without loss of 
generality, we will consider only (possibly infinite) propositional programs. A 
non-propositional program is then understood as a shorthand for the (possibly 
infinite) set of ground clauses obtained by instantiating the original rules in all 
possible ways over the Herbrand universe. 

Following [18,38], the positive version $P^+$ of a normal program $P$ is the
definite program obtained by replacing in $P$ each negated atom $not\ A$ by a new
positive atom $not\_A$. The Herbrand base $\mathcal{B}^+$ associated with $P^+$ is then the set
obtained by extending the Herbrand base $\mathcal{B}$ of $P$ with the new set of atoms:

$not\_\mathcal{B} = \{not\_A \mid A \in \mathcal{B}\}$

that is:

$\mathcal{B}^+ = \mathcal{B} \cup not\_\mathcal{B}$.

The intended meaning of $P$ will be defined by suitably restricting the Herbrand
models of $P^+$ which are subsets of the extended Herbrand base.

From now onward, we will not distinguish any further between a normal
program $P$ and its positive version $P^+$, that is, we will write $P$ directly for its
positive version.



4.2 Supported Interpretations for Normal Programs 

Let us take the negative part $not\_\mathcal{B}$ of the extended Herbrand base $\mathcal{B}^+$ as the
set of assumable hypotheses. We get the following definition of supported
interpretations of a program $P$:

$SI(P, not\_\mathcal{B}) = \{I(H) \mid H \subseteq not\_\mathcal{B} \wedge I = LHM(P \cup H)\}$.

As shown in [9], the set of supported interpretations can be suitably restricted
in a step-wise way so as to identify the set of complete models of a program.
Such a step-wise process can be summarised as follows:

- A supported interpretation $I(H)$ of a program $P$ is a supported model of $P$
if $I$ is consistent (i.e., $\nexists A : A \in I \wedge not\_A \in I$).

- A supported interpretation $J(K)$ is a conservative extension of a supported
model $I(H)$ if and only if $K \supseteq H$ and ($\nexists A : not\_A \in K \wedge A \in I$).

- A supported model $I(H)$ is a complete model of $P$ if and only if $\forall a \in \mathcal{B}$:

$not\_a \in H \Leftrightarrow (a \notin J$ for each conservative extension $J(K)$ of $I(H))$.

- Each normal program has at least one complete model.

Example 13. Consider the program:

$P$: $a \leftarrow not\ b$

and suppose for the sake of simplicity that $\mathcal{B} = \{a, b\}$. Then the supported
interpretations $SI(P, not\_\mathcal{B})$ are:

$\emptyset\ (\emptyset)$

$\{not\_a\}\ (\{not\_a\})$

$\{a, not\_b\}\ (\{not\_b\})$

$\{a, not\_a, not\_b\}\ (\{not\_a, not\_b\})$

It is easy to observe that each supported interpretation of $P$ is a conservative
extension of the model supported by the empty set of hypotheses. On the other
hand, there is no other conservative extension of the model $\{a, not\_b\}(\{not\_b\})$ since its
only extension $\{a, not\_a, not\_b\}(\{not\_a, not\_b\})$ is not conservative with it (viz.,
$not\_a$ is a hypothesis of the latter while $a$ belongs to the model of the former).
We can therefore observe that $\{a, not\_b\}(\{not\_b\})$ is the only complete model
for $P$. $\Box$
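
The step-wise selection can be replayed on Example 13 with a small brute-force sketch. It rests on assumptions that are not the paper's: programs are finite and propositional, the positive version is encoded with atoms named `not_x`, every hypothesis set is counted as extending itself, and the complete-model condition is read as an equivalence. All function names are illustrative.

```python
from itertools import chain, combinations

def least_model(clauses):
    """Least Herbrand model of a propositional definite program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def supported_interpretations(program, hypotheses):
    """SI(P, Hyp) as a dict mapping each H within Hyp to I = LHM(P + H)."""
    return {H: frozenset(least_model(program + [(h, []) for h in H]))
            for H in subsets(hypotheses)}

def complete_models(program, base):
    """Complete models as (I, H) pairs, hypotheses drawn from not_B."""
    si = supported_interpretations(program, {"not_" + a for a in base})

    def consistent(I):
        return not any(a in I and "not_" + a in I for a in base)

    def conservative_extensions(H, I):
        # J(K) with K a superset of H and no not_A in K such that A is in I.
        return [J for K, J in si.items()
                if K >= H and not any("not_" + a in K for a in base if a in I)]

    result = []
    for H, I in si.items():
        if not consistent(I):
            continue  # not a supported model
        exts = conservative_extensions(H, I)
        if all(("not_" + a in H) == all(a not in J for J in exts) for a in base):
            result.append((I, H))
    return result

# Positive version of the program  a <- not b
P = [("a", ["not_b"])]
print(supported_interpretations(P, {"not_a", "not_b"}))
print(complete_models(P, {"a", "b"}))
```

For this program the sketch enumerates the four supported interpretations listed in the example and keeps only {a, not_b}({not_b}) as a complete model.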

Notably these complete models have a tight relation with other models pro- 
posed in the literature. In the case of normal programs, complete models have 
been shown in [9] to correspond to: 

— stable models semantics [23], 

— well-founded semantics [41], 

— stationary semantics [38], and 

— preferential semantics [17]. 

For instance, in [9], it is shown that: 

Let $P$ be a normal program and let $M \subseteq \mathcal{B}$. Then $M$ is a stable model of
$P$ if and only if $M \cup \{not\_A \mid A \notin M\}$ is a total complete model for $P$.
(An interpretation $M$ is total iff for each $A \in \mathcal{B}$: $A \in M$ or $not\_A \in M$.)

Similar correspondences are established with well-founded models [41], with 
stationary expansions [38], and with complete scenaria [17]. 

These correspondences can be equivalently described in the following way. For
each semantics $S$ considered, there exists a suitable projection function $\varphi_S$ which,
given the set of supported interpretations $SI(P^+, not\_\mathcal{B})$ of (the positive version
of) a program $P$, yields the models $S(P)$ of the corresponding semantics $S$:

$\varphi_S : SI(P^+, not\_\mathcal{B}) \mapsto S(P)$.

In terms of program equivalence, this means that the supported interpretations
semantics preserves all the semantics which have been considered in [9].
Indeed for any such semantics $S$:

$P \equiv_{SI(not\_\mathcal{B})} Q \Rightarrow P \equiv_S Q$.



4.3 Compositionality 

As the supported interpretations semantics preserves many different meanings
of programs, the compositionality of the induced equivalence relation $\equiv_{SI(not\_\mathcal{B})}$
would be of major importance. Indeed it would allow the substitution of parts of
normal programs without affecting the meaning of the whole program, for any
meaning considered.

Unfortunately, the equivalence relation $\equiv_{SI(not\_\mathcal{B})}$ is not a congruence w.r.t.
the union of programs.

Example 14. Consider the program

$P$: $a \leftarrow b$

and let $Q$ be the empty program. While $P \equiv_{SI(not\_\mathcal{B})} Q$, there exists a program
$R$ such that $P \cup R \not\equiv_{SI(not\_\mathcal{B})} Q \cup R$. For instance, consider the program

$R$: $b \leftarrow not\ c$

It is easy to see that $SI(P \cup R, not\_\mathcal{B}) \neq SI(Q \cup R, not\_\mathcal{B})$ since the supported
interpretation $\{a, b, not\_c\}(\{not\_c\})$ belongs to the former and not to the latter. $\Box$
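
Example 14 can also be replayed by enumeration. The sketch below is only illustrative and assumes finite propositional programs in positive form, with negated atoms written `not_x`; the helper `si` is a hypothetical name that returns supported interpretations as (I, H) pairs.

```python
from itertools import chain, combinations

def least_model(clauses):
    """Least Herbrand model of a propositional definite program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model

def si(program, hypotheses):
    """SI(P, Hyp) as a set of pairs (I, H), H within Hyp and I = LHM(P + H)."""
    hyps = sorted(hypotheses)
    all_h = chain.from_iterable(
        combinations(hyps, k) for k in range(len(hyps) + 1))
    return {(frozenset(least_model(program + [(h, []) for h in H])), frozenset(H))
            for H in all_h}

P = [("a", ["b"])]          # a <- b
Q = []                      # the empty program
R = [("b", ["not_c"])]      # positive version of  b <- not c

not_B = {"not_a", "not_b", "not_c"}
B_plus = {"a", "b", "c"} | not_B
witness = (frozenset({"a", "b", "not_c"}), frozenset({"not_c"}))

print(si(P, not_B) == si(Q, not_B))          # equivalent over not_B
print(si(P + R, not_B) == si(Q + R, not_B))  # no longer equivalent after union with R
print(witness in si(P + R, not_B), witness in si(Q + R, not_B))
print(si(P, B_plus) == si(Q, B_plus))        # already distinguished over B+
```

The last comparison hints at why the finer relation over the whole extended base behaves better under union: it already separates P from Q before any context is added.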

By exploiting Proposition 4 it is however possible to establish a compositionality
result for normal programs. Indeed since $not\_\mathcal{B} \subset \mathcal{B}^+$ we have that:

$\equiv_{SI(\mathcal{B}^+)} \ \subset\ \equiv_{SI(not\_\mathcal{B})}$

and by the compositionality of logical equivalence we obtain the following result.

Proposition 5. $\equiv_{SI(\mathcal{B}^+)}$ is compositional for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$.

Since the supported interpretations equivalence relation $\equiv_{SI(not\_\mathcal{B})}$ preserves all
the semantics for normal programs considered in [9], we have that if (the
positive versions of) two programs are logically equivalent then they have the same
meaning $S$ for each $S$ considered in [9], that is, they have the same complete
scenaria [17], the same stationary expansions [38], the same stable models [23],
and the same well-founded model [41].

Corollary 1. $\equiv_{SI(\mathcal{B}^+)}$ is compositional for $(\equiv_S, \{\cup\})$, for all $S$ considered in [9].


4.4 Full Abstraction

The compositionality of logical equivalence for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$ states that if
(the positive versions of) two programs $R$ and $R'$ are logically equivalent then
they can be exchanged with one another without affecting the meaning $S$ of the
context in which they occur.

An intriguing question is whether or not this is the largest class of programs
which can be substituted for one another without affecting the intended meaning $S$
of the context in which they occur.

As the equivalence relation $\equiv_{SI(not\_\mathcal{B})}$ preserves several semantics for normal
programs, it is interesting to determine whether logical equivalence is fully
abstract for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$. This is exactly what we establish here for the first
time with the following proposition.

Proposition 6. $\equiv_{SI(\mathcal{B}^+)}$ is fully abstract for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$.

It is worth noting that the above result states that logical equivalence defines
the largest class of programs which can be substituted for one another without
affecting the set of negatively supported interpretations of the context in which
they occur. The full abstraction result is hence relative to the equivalence relation
$\equiv_{SI(not\_\mathcal{B})}$, and its importance is due to the fact that $\equiv_{SI(not\_\mathcal{B})}$ preserves a
number of different intended semantics for normal programs. On the other hand,
Proposition 6 does not imply that logical equivalence is fully abstract for every
intended meaning of normal programs. For instance, an equivalence relation
coarser than logical equivalence (of the positive versions) may be fully abstract
for the stable model semantics of normal programs.

4.5 Summary 

In the previous sections, we have discussed the issue of compositionality for the
case of normal logic programs. As we observed at the beginning of Section 4, the
two main problems in designing a tour for analyzing compositionality issues
in normal logic programs were:

(1) the existence of many different semantics for normal programs (there is no
universal agreement on the intended meaning of a normal program, as there is
instead for definite programs), and

(2) the orthogonality of negation and compositionality (existing semantics for
normal programs are typically not compositional).

Following the steps of [9], we have observed that supported interpretations
provide a unifying model-theoretic characterization of a number of semantics for
normal programs. The idea is to consider the positive version of programs and
to take $not\_\mathcal{B}$ as the universe of assumable hypotheses. The induced equivalence
relation $\equiv_{SI(not\_\mathcal{B})}$ then preserves the stable models semantics [23], the
well-founded semantics [41], the preferential semantics [17], and the stationary
semantics [38]. While $\equiv_{SI(not\_\mathcal{B})}$ is not a congruence for the union of programs,
the equivalence relation $\equiv_{SI(\mathcal{B}^+)}$ is compositional for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$,
and hence $\equiv_{SI(\mathcal{B}^+)}$ is compositional for $(\equiv_{S_i}, \{\cup\})$, for each semantics $S_i$
preserved by $\equiv_{SI(not\_\mathcal{B})}$. Finally, as a new result, we have shown that $\equiv_{SI(\mathcal{B}^+)}$ is the
fully abstract equivalence relation for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$. The relations between
the semantics considered in this section are summarised in Figure 3.



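$SI(P^+, \mathcal{B}^+) = SI(Q^+, \mathcal{B}^+)$

$\Downarrow$

$SI(P^+, not\_\mathcal{B}) = SI(Q^+, not\_\mathcal{B})$

$\Downarrow$

$S_1(P) = S_1(Q) \qquad S_2(P) = S_2(Q) \qquad S_3(P) = S_3(Q) \qquad S_4(P) = S_4(Q)$

Fig. 3. The equivalence hierarchy for normal programs, where $S_1, \ldots, S_4$ are the
semantics for normal programs considered in [9].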



It is worth observing that the notion of supported interpretation can be
exploited to provide a compositional characterization of other extensions of logic
programming, besides negation-by-default. Two main classes of extensions of
logic programming are considered in [9]:

- Other forms of negation, such as explicit negation [34], answer set
semantics [24], and 3-valued stable semantics [37];

- Abduction, as modeled for instance in [16] and in [26].

As shown in [9], all these extensions can be provided with a uniform
characterization by means of supported interpretations by considering different universes
of assumable hypotheses. Lack of space forces us to refer the reader to
[9] for more details.

Several other efforts have been devoted to investigating the composition of normal
logic programs.

The splitting of a logic program into parts was investigated in [28] in the 
context of the answer set semantics [24]. The basic idea is that, in many cases, 
a program can be divided into a “bottom” part and a “top” part, such that the 
former does not refer to predicates defined in the latter. In [28] it is shown that 
computing the answer sets for a program can be simplified when the program 
is split into parts. It is also shown that the idea of splitting can be applied for 
proving properties of simple program compositions, like a conservative extension 
property for a program P extended by rules whose heads do not occur in P.

The problem of defining compositional semantics for normal logic programs
was studied in [20] and in [42]. [20] defines a compositional semantics for
normal programs by means of a first-order unfolding operator. Such a semantics is
applied for composing “open” normal programs by considering the union of pro- 
grams as the only composition operation. [42] presents several results on the com- 
positionality of normal logic programs by generalising the well-founded semantics 
of logic programming. More precisely, they identify some conditions under which 
the class of extended well-founded models of the union of two normal programs 
coincides with the intersection of the classes of extended well-founded models 
of the two programs. Another difference between the approaches presented in 
[20,42] and ours is the restrictive naming policy imposed on the predicate names 
of the programs to be composed. Namely, in contrast with our naming policy, 
predicate definitions cannot be spread over different programs. 

An even more restrictive naming policy is considered in [13] for defining a 
compositional model-theory for definite and disjunctive programs. The author 
investigates the usability of minimal logic for modeling the meaning of extended 
logic programs by allowing “local” inconsistencies. The compositionality results 
are however restricted to pairs of programs which define disjoint sets of predicates 
and which do not interact with each other. 

A compositional semantics for normal programs was also defined in [8], where
a family of program composition operations is considered. The semantics of
programs and program compositions is defined in terms of three-valued logic by
extending the three-valued immediate consequence operator for logic programs
proposed in [21].

Finally, a related work is [5], where a set of operations for composing logic
programs is studied in the context of intensional negation. Negation is handled
in a constructive way by transforming normal programs into pairs of definite
programs and by defining the composition operations on such pairs.

5 Other Forms of Program Compositions 

5.1 Program Extension 

The operation of union between programs can be employed to combine programs 
that fully cooperate in the deduction process. Namely one program may exploit 
partial conclusions of the other, and vice-versa. 

Such a symmetric composition-by-union is not however the only form of 
program composition employed in program development. For instance, programs 
often import the functionalities of some existing module or library, where the 
latter does not in turn rely on the former for its computation [28].

A typical example comes from deductive databases, where a set of (recursive) 
rules R is composed with an extensional database D defining the values of a set of 
relations and typically consisting of extensional clauses. An intriguing question 
is when R can be substituted with a different, possibly more efficient, set of rules 
$R'$ without affecting the meaning of the whole system.

Let us then consider here this form of program composition, which we call
program extension to distinguish it from the composition-by-union considered in
the previous sections. Formally, rather than considering arbitrary expressions of
the form:

$Exp ::= P \mid Exp \cup Exp$

where $P$ is a definite program, we now consider restricted compositions of the
form:

$Exp ::= P \mid Exp \cup_\pi D$

where $D$ is an extension, that is, an extensional program consisting of unit
clauses only. We may also assume that the predicates defined by extensions are all
members of a pre-fixed set of predicates $\pi$. The symbol $\cup_\pi$ hence serves both to record
the set $\pi$ and to distinguish program extension from the general
composition-by-union considered in the previous sections. According to the above syntax, we
therefore consider expressions of the form:

$P, \quad P \cup_\pi D_1, \quad (P \cup_\pi D_1) \cup_\pi D_2, \quad ((P \cup_\pi D_1) \cup_\pi D_2) \cup_\pi D_3, \quad \ldots$

and so on. Notice that instead of introducing the new symbol $\cup_\pi$
we may simply consider the union operation $\cup$ as a partial composition function
which is defined only when its second argument is an extension.

We now show that supported interpretations can be naturally employed
to obtain a compositional denotation of programs also for this new form of
composition-by-extension. Consider as the set of assumable hypotheses the
following set:

$\mathcal{B}_\pi = \{A \mid A \in \mathcal{B} \wedge pred(A) \in \pi\}$

where $pred(A)$ denotes the predicate symbol of the atom $A$. Namely $\mathcal{B}_\pi$ is the set
of atoms in the Herbrand base with predicates in the set $\pi$. The corresponding
set of supported interpretations therefore is:

$SI(P, \mathcal{B}_\pi) = \{I(H) \mid H \subseteq \mathcal{B}_\pi \wedge I = LHM(P \cup H)\}$

and the induced equivalence relation is:

$P \equiv_{SI(\mathcal{B}_\pi)} Q \Leftrightarrow SI(P, \mathcal{B}_\pi) = SI(Q, \mathcal{B}_\pi)$.

We observe that the equivalence relation $\equiv_{SI(\mathcal{B}_\pi)}$ is strictly coarser than $\equiv_{SI(\mathcal{B})}$
and strictly finer than $\equiv_{LHM}$.

Proposition 7. $\equiv_{SI(\mathcal{B})} \ \subset\ \equiv_{SI(\mathcal{B}_\pi)} \ \subset\ \equiv_{LHM}$.
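
To see concretely why least-Herbrand-model equivalence is too coarse for program extension, consider the following minimal sketch. The two programs, the set `B_pi` and the extension `D` are hypothetical illustrations chosen here (propositional atoms stand in for predicates), and the helper names are not taken from the paper.

```python
from itertools import chain, combinations

def least_model(clauses):
    """Least Herbrand model of a propositional definite program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model

def si(program, hypotheses):
    """SI(P, Hyp) as a set of pairs (I, H), H within Hyp and I = LHM(P + H)."""
    hyps = sorted(hypotheses)
    all_h = chain.from_iterable(
        combinations(hyps, k) for k in range(len(hyps) + 1))
    return {(frozenset(least_model(program + [(h, []) for h in H])), frozenset(H))
            for H in all_h}

P = [("a", ["b"])]     # a <- b
Q = [("a", ["c"])]     # a <- c
B_pi = {"b", "c"}      # atoms whose predicates may be defined by extensions
D = [("b", [])]        # an extension: the single unit clause  b <-

same_lhm = least_model(P) == least_model(Q)
same_si = si(P, B_pi) == si(Q, B_pi)
same_after_extension = least_model(P + D) == least_model(Q + D)
print(same_lhm, same_si, same_after_extension)
```

Here the two programs have the same (empty) least Herbrand model, yet extending both with D yields different models; the supported interpretations over `B_pi` already tell them apart before the extension is applied.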

Moreover, the equivalence relation $\equiv_{SI(\mathcal{B}_\pi)}$ is compositional for $(O, \{\cup_\pi\})$, as
stated by the following proposition.

Proposition 8. $\equiv_{SI(\mathcal{B}_\pi)}$ is compositional w.r.t. $(O, \{\cup_\pi\})$.

It is perhaps even more interesting to observe that supported interpretations can
be employed to define a compositional characterisation of program extension also
in the case of normal programs.

Indeed, consider expressions of the form:

$Exp ::= P \mid Exp \cup_\pi D$

where $P$ is a normal program and $D$ is an extension, that is, an extensional
program consisting of unit clauses only.

Then we simply consider as the set of assumable hypotheses the set:

$\mathcal{B}_\pi \cup not\_\mathcal{B}$

The corresponding set of supported interpretations therefore is:

$SI(P, \mathcal{B}_\pi \cup not\_\mathcal{B}) = \{I(H) \mid H \subseteq (\mathcal{B}_\pi \cup not\_\mathcal{B}) \wedge I = LHM(P \cup H)\}$

and the induced equivalence relation is:

$P \equiv_{SI(\mathcal{B}_\pi \cup not\_\mathcal{B})} Q \Leftrightarrow SI(P, \mathcal{B}_\pi \cup not\_\mathcal{B}) = SI(Q, \mathcal{B}_\pi \cup not\_\mathcal{B})$.

It is easy to observe that $\equiv_{SI(\mathcal{B}_\pi \cup not\_\mathcal{B})}$ is strictly coarser than $\equiv_{SI(\mathcal{B}^+)}$ and
strictly finer than $\equiv_{SI(not\_\mathcal{B})}$.

Proposition 9. $\equiv_{SI(\mathcal{B}^+)} \ \subset\ \equiv_{SI(\mathcal{B}_\pi \cup not\_\mathcal{B})} \ \subset\ \equiv_{SI(not\_\mathcal{B})}$.

Moreover the equivalence relation $\equiv_{SI(\mathcal{B}_\pi \cup not\_\mathcal{B})}$ is compositional for
$(\equiv_{SI(not\_\mathcal{B})}, \{\cup_\pi\})$, as proved by the following proposition.

Proposition 10. $\equiv_{SI(\mathcal{B}_\pi \cup not\_\mathcal{B})}$ is compositional for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup_\pi\})$.

5.2 A Family of Program Composition Operations 

A family of program composition operations has been studied in [7]. Besides the
union operation, three other main composition operations have been considered:
intersection ($\cap$), encapsulation ($*$), and import ($\triangleleft$). These operations are defined
in a semantics-driven style, following the observation that if the meaning of
a program $P$ is denoted by the corresponding immediate consequence operator
$T(P)$, then such a meaning is a homomorphism for several interesting operations
on programs. The formal semantics of the operations is defined as follows:

$T(P \cap Q)(I) = T(P)(I) \cap T(Q)(I)$

$T(P^*)(I) = T^\omega(P)(\emptyset)$

$T(P \triangleleft Q)(I) = T(P)(I \cup T^\omega(Q)(\emptyset))$
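
The three equations can be read as operations on immediate consequence operators. The sketch below is one possible rendering under stated assumptions: programs are finite, propositional and definite, so that the limit of the powers of T from the empty set is reached by plain iteration; union is added using the standard identity T(P ∪ Q)(I) = T(P)(I) ∪ T(Q)(I); all function names are illustrative.

```python
def T(program):
    """Immediate consequence operator of a propositional definite program."""
    def op(I):
        return frozenset(h for h, body in program if set(body) <= I)
    return op

def omega(op):
    """Iterate a monotone operator from the empty set up to its fixpoint."""
    I = frozenset()
    while True:
        nxt = op(I)
        if nxt == I:
            return I
        I = nxt

def union(tp, tq):
    return lambda I: tp(I) | tq(I)

def intersection(tp, tq):
    return lambda I: tp(I) & tq(I)

def encapsulation(tp):
    closed = omega(tp)            # the operator ignores its argument
    return lambda I: closed

def import_(tp, tq):
    q_closed = omega(tq)          # consequences of Q are visible to P
    return lambda I: tp(I | q_closed)

P = [("a", ["b"])]   # a <- b
Q = [("b", [])]      # b <-

print(omega(union(T(P), T(Q))))
print(omega(intersection(T(P), T(Q))))
print(T(P)(frozenset({"b"})), encapsulation(T(P))(frozenset({"b"})))
print(omega(import_(T(P), T(Q))))
```

With these two programs, union derives both a and b, intersection derives nothing, encapsulation of P ignores the hypothesis b supplied from outside, and importing Q into P yields a without exporting b.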

Such a set of basic composition operations forms an algebra of logic programs
with interesting properties for reasoning about programs and program
compositions. From a programming perspective, the operations enhance the expressive
power of logic programming by supporting a wealth of programming techniques,
ranging from software engineering to artificial intelligence applications.

A thorough discussion of the family of composition operations is outside
the scope of the present paper. It is however worth mentioning here that [12]
showed that the chain of equivalence relations $\equiv_T \ \subset\ \equiv_{T+id} \ \subset\ \equiv_M \ \subset\ \equiv_{LHM}$
(see Figure 2) coincides with the chain of fully abstract compositional
equivalence relations for subsets of a family of program composition operations, as
summarized in Figure 4.

Fig. 4. The chain of fully abstract compositional equivalence relations. (“C”
stands for compositional, “FAC” for fully abstract and compositional, while “-”
stands for non-compositional.)



6 Concluding Remarks 

As we already mentioned in the Introduction, our analysis of compositional se- 
mantics for logic programs was not intended to be exhaustive, in the sense of 
analysing all the (many) different proposals that have been developed over the 
last ten years. 

We have rather tried to guide the reader across different compositional se- 
mantics for logic programs by highlighting the existing relations among them, 
and by establishing new relations and results on our way. 

During our tour, the notion of supported interpretation has been shown to 
provide a general and unifying mechanism for obtaining compositional deno- 
tations of both definite and normal programs, also in the case of asymmetric 
compositions.


Appendix

This Appendix contains the proofs of the propositions and lemmas stated in the
paper. To simplify the proofs, we will sometimes abuse notation and denote with
the same symbol, $I$ say, both an interpretation $I$ and the program $\{h \leftarrow \mid h \in I\}$.



Lemma 1

Let $P$ be a program and let $I \subseteq \mathcal{B}$. Then:

$I \in M(P) \Leftrightarrow \exists H \subseteq \mathcal{B} : I = LHM(P \cup H)$.

Proof. (if part) If $I = LHM(P \cup H)$ then, by definition of least Herbrand model,
$I \in M(P \cup H)$ and, by Observation 1, $I \in M(P)$.

(only-if part) We observe that if $I \in M(P)$ then $I = LHM(P \cup I)$. By definition
of least Herbrand model and by Observation 1, we have that:

$LHM(P \cup I) = \min\{J \mid J \in M(P) \wedge J \in M(I)\}$.

Then since $I \in M(P)$ and since $I = LHM(I)$, we have that $LHM(P \cup I) = I$.



Lemma 2

Let $P$ and $Q$ be two programs. Then:

$P \equiv_M Q \Leftrightarrow \forall H \subseteq \mathcal{B} : LHM(P \cup H) = LHM(Q \cup H)$.

Proof. (if part) If $I \in M(P)$ then by Lemma 1 there exists $H \subseteq \mathcal{B}$ such that $I =
LHM(P \cup H)$. By hypothesis this implies that $I = LHM(Q \cup H)$ and hence,
by Lemma 1, that $I \in M(Q)$.

(only-if part) By definition of least Herbrand model and by Observation 1:
$LHM(P \cup H) = \min\{I \mid I \in M(P) \wedge I \in M(H)\}$.

Hence, since $M(P) = M(Q)$ by hypothesis, we have that $LHM(P \cup H) =
LHM(Q \cup H)$ for any $H \subseteq \mathcal{B}$.
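
Both lemmas can be sanity-checked by exhaustive enumeration on small propositional programs. The sketch below is only an illustration under that assumption; the programs `P`, `Q` and `E` and all function names are hypothetical examples chosen here, not taken from the paper.

```python
from itertools import chain, combinations

def least_model(clauses):
    """Least Herbrand model of a propositional definite program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model

def subsets(items):
    items = sorted(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def lhm(program, H):
    """LHM(P + H), with the hypotheses H added as unit clauses."""
    return frozenset(least_model(program + [(h, []) for h in H]))

def models(program, base):
    """All Herbrand models of the program within the given base."""
    return {I for I in subsets(base)
            if all(h in I for h, body in program if set(body) <= I)}

def lemma1_holds(program, base):
    # I is a model of P  iff  I = LHM(P + H) for some H within the base.
    return models(program, base) == {lhm(program, H) for H in subsets(base)}

def lemma2_sides(p, q, base):
    same_models = models(p, base) == models(q, base)
    same_lhm = all(lhm(p, H) == lhm(q, H) for H in subsets(base))
    return same_models, same_lhm

base = {"a", "b"}
P = [("a", ["b"])]                    # a <- b
Q = [("a", ["b"]), ("a", ["a"])]      # P plus the tautology  a <- a
E = []                                # the empty program

print(lemma1_holds(P, base), lemma1_holds(Q, base), lemma1_holds(E, base))
print(lemma2_sides(P, Q, base), lemma2_sides(P, E, base))
```

On these inputs the two sides of Lemma 2 agree in both comparisons: P and Q have the same models and the same least models under every set of hypotheses, while P and the empty program differ on both counts.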



We now introduce the following two lemmas, which will be used in the proofs of
Propositions 1 and 2, respectively.

Lemma 3. Let $P$ be a program, let $K \subseteq H \subseteq \mathcal{B}$. Then:

$LHM(P \cup H) - LHM(P \cup K) \supseteq H - K \Leftrightarrow H \cap (LHM(P \cup K) - K) = \emptyset$.

Proof. (if part) Suppose that $LHM(P \cup H) - LHM(P \cup K) \not\supseteq H - K$, that
is, suppose that there exists $x$ such that: $x \in (H - K) \wedge x \notin (LHM(P \cup H) -
LHM(P \cup K))$. Since $H \subseteq LHM(P \cup H)$ this means that: $x \in (H - K) \wedge x \in
LHM(P \cup K)$. Hence we have that there exists $x$: $x \in (H \cap (LHM(P \cup K) - K))$.
Contradiction.

(only-if part) Suppose that there exists $x$ such that: $x \in H \wedge x \in LHM(P \cup K) \wedge
x \notin K$. Then: $x \in (H - K) \wedge x \notin (LHM(P \cup H) - LHM(P \cup K))$. Contradiction.

Lemma 4. Let $P$ and $Q$ be definite programs. Then:

$LHM(P \cup Q) \supseteq LHM(P) \cup LHM(Q)$.

Proof. This result follows immediately from Observation 1. For instance:
$LHM(P \cup Q) = \min\{I \mid I \in M(P \cup Q)\} = \min\{I \mid I \in M(P) \wedge I \in M(Q)\} \supseteq
\min\{I \mid I \in M(P)\} = LHM(P)$.



Proposition 1

Let $P$ be a program, let $I \subseteq \mathcal{B}$ and let $H \subseteq \mathcal{B}$.

$I(H) \in AM(P) \Rightarrow \exists K, J : (K \subseteq H \wedge J(K) \in \mu AM(P) \wedge I = J \cup (H - K))$.

Proof. (case a) If $I(H) \in \mu AM(P)$ then the assertion trivially holds (just take
$K = H$).

(case b) If $I(H) \notin \mu AM(P)$ then, by definition of minimal admissible model:

$\exists K \subset H : LHM(P \cup H) - LHM(P \cup K) \not\supseteq H - K$.

By Lemma 3, this means that:

$\exists K \subset H : H \cap (LHM(P \cup K) - K) \neq \emptyset$.

Let $x \in H \cap (LHM(P \cup K) - K)$. We now show that:

$LHM(P \cup H) = LHM(P \cup (H - \{x\})) \cup \{x\}$.

Since $x \in LHM(P \cup K)$ then by Observation 1 we have that $x \in LHM(P \cup K \cup A)$
for any $A \subseteq \mathcal{B}$. Therefore we have that $x \in LHM(P \cup K \cup (H - \{x\}))$, that is,
since $x \notin K$, $x \in LHM(P \cup (H - \{x\}))$. Now if $x \in LHM(P \cup (H - \{x\}))$ then
$LHM(P \cup H) = LHM(P \cup (H - \{x\}))$ and hence $LHM(P \cup H) = LHM(P \cup
(H - \{x\})) \cup \{x\}$.

Let now $J = LHM(P \cup (H - \{x\}))$. If $J(H - \{x\}) \in \mu AM(P)$ then the
statement is proved. Otherwise we repeatedly apply the reasoning of (case b)
to obtain a decreasing chain of sets of hypotheses $K_0 \supset K_1 \supset K_2 \supset \ldots$ where
$K_0 = H$ and where for each $K_i$:

- $J_i(K_i) \in AM(P)$, where $J_i = LHM(P \cup K_i)$, and

- $J_i = J_{i+1} \cup (K_i - K_{i+1})$.

Such a chain has a greatest lower bound $K = \bigcap_i K_i$ such that $J(K) \in \mu AM(P)$,
in the worst case $K$ being the empty set. We therefore have that: $LHM(P \cup H) =
LHM(P \cup K) \cup (H - K)$.



Proposition 2

$\equiv_M \ =\ \equiv_{\mu AM}$.

Proof. ($\subseteq$) We first show that:

$P \equiv_M Q \Rightarrow P \equiv_{\mu AM} Q$.

Suppose that:

$\exists P, Q : P \equiv_M Q \wedge P \not\equiv_{\mu AM} Q$,

namely

$\exists I, H : I(H) \in \mu AM(P) \wedge I(H) \notin \mu AM(Q)$.

(The other case is analogous.)

(i) Suppose that $H \subseteq AH(Q)$. Since $I(H) \in \mu AM(P)$ then:

$\forall K \subset H : LHM(P \cup H) - LHM(P \cup K) \supseteq H - K$.

Since $P \equiv_M Q$ then, by Lemma 2:

$\forall K \subset H : LHM(Q \cup H) - LHM(Q \cup K) \supseteq H - K$.

Therefore we have that $I(H) \in \mu AM(Q)$. Contradiction.

(ii) Suppose that $H \not\subseteq AH(Q)$.

Then $\exists n : n \in H \wedge n \notin AH(Q)$. Let $K = H - \{n\}$.

• If $n \in LHM(Q \cup K)$ then

$LHM(Q \cup K \cup \{n\}) = LHM(Q \cup K)$,

that is,

$LHM(Q \cup H) = LHM(Q \cup K)$.

Therefore, since $P \equiv_M Q$, by Lemma 2:

$\exists K \subset H : LHM(P \cup H) - LHM(P \cup K) = \emptyset$.

Therefore $I(H) \notin \mu AM(P)$. Contradiction.

• Suppose instead that $n \notin LHM(Q \cup K)$. Since $n \notin AH(Q)$:

$LHM(Q \cup H) = LHM(Q \cup K) \cup \{n\}$.

Then, since $n \notin LHM(Q \cup K)$:

$LHM(Q \cup H) - LHM(Q \cup K) = \{n\}$.

Therefore since $P \equiv_M Q$ and by Lemma 2:

$\exists K \subset H : LHM(P \cup H) - LHM(P \cup K) \not\supseteq H - K$.

Hence $I(H) \notin \mu AM(P)$. Contradiction.

($\supseteq$) We now show that:

$P \equiv_{\mu AM} Q \Rightarrow P \equiv_M Q$.

Suppose that:

$\exists P, Q : P \equiv_{\mu AM} Q \wedge P \not\equiv_M Q$

that is:

$\exists I : I \in M(P) \wedge I \notin M(Q)$.

(The other case is analogous.)

Observe that

$LHM(P \cup I) = I$

since, by Observation 1, $LHM(P \cup I) = \min\{J \mid J \in M(P) \wedge J \in M(I)\}$
and since $I = LHM(I)$ and $I \in M(P)$. Moreover:

$LHM(Q \cup I) \supset I$

since, by Observation 1, if $J \in M(Q \cup I)$ then $J \supseteq I$ and since $I \notin M(Q)$.
Let now $I_{AQ} = I \cap AH(Q)$ and let $I_{NQ} = I - I_{AQ}$. Then:

$LHM(Q \cup I) = LHM(Q \cup I_{AQ}) \cup I_{NQ}$.

Since, by Lemma 4:

$I = LHM(P \cup I) \supseteq LHM(P \cup I_{AQ}) \cup LHM(I_{NQ}) = LHM(P \cup I_{AQ}) \cup I_{NQ}$

we observe that:

$LHM(P \cup I_{AQ}) \subseteq I$ and $I_{NQ} \subseteq I$.

Therefore, since $LHM(Q \cup I) \supset I$ and $LHM(Q \cup I) = LHM(Q \cup I_{AQ}) \cup I_{NQ}$
and $I_{NQ} \subseteq I$, we observe that:

$LHM(Q \cup I_{AQ}) \not\subseteq I$.

Consider now $N = LHM(Q \cup I_{AQ})$. Since $N(I_{AQ}) \in AM(Q)$ then, by
Proposition 1, $\exists K, J$:

$K \subseteq I_{AQ} \wedge J(K) \in \mu AM(Q) \wedge LHM(Q \cup I_{AQ}) = LHM(Q \cup K) \cup (I_{AQ} - K)$.

Since $LHM(Q \cup I_{AQ}) \not\subseteq I$ and since $I_{AQ} \subseteq I$, this implies that:

$LHM(Q \cup K) \not\subseteq I$.

Since $J(K) \in \mu AM(Q)$ and since $P \equiv_{\mu AM} Q$ then $J(K) \in \mu AM(P)$ and
hence:

$LHM(P \cup K) \not\subseteq I$.

On the other hand, since $K \subseteq I_{AQ}$:

$I \supseteq LHM(P \cup I_{AQ}) \cup I_{NQ} \supseteq LHM(P \cup I_{AQ}) \supseteq LHM(P \cup K)$.

Contradiction.



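Proposition 3

$\equiv_M \ =\ \equiv_{SI(\mathcal{B})}$.

Proof. ($\subseteq$) If $P \equiv_M Q$ then, by Lemma 2:

$\forall H \subseteq \mathcal{B} : LHM(P \cup H) = LHM(Q \cup H)$.

Hence:

$\{I(H) \mid H \subseteq \mathcal{B} \wedge I = LHM(P \cup H)\} = \{I(H) \mid H \subseteq \mathcal{B} \wedge I = LHM(Q \cup H)\}$

that is:

$P \equiv_{SI(\mathcal{B})} Q$.

($\supseteq$) By Lemma 1, if $I \in M(P)$ then $\exists H : I(H) \in SI(P)$. Since $SI(P) = SI(Q)$
by hypothesis, then $I(H) \in SI(Q)$ and hence $I \in M(Q)$ by Lemma 1.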



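Proposition 4

Let $\mathcal{H} \subseteq \mathcal{B}$, $\mathcal{K} \subseteq \mathcal{B}$ be two sets of assumable hypotheses. Then:

(1) $\mathcal{H} \subseteq \mathcal{K} \Rightarrow \ \equiv_{SI(\mathcal{K})} \ \subseteq\ \equiv_{SI(\mathcal{H})}$

(2) $\mathcal{H} \subset \mathcal{K} \Rightarrow \ \equiv_{SI(\mathcal{K})} \ \subset\ \equiv_{SI(\mathcal{H})}$

Proof. (1) We show that if $\mathcal{H} \subseteq \mathcal{K}$ then ($P \equiv_{SI(\mathcal{K})} Q \Rightarrow P \equiv_{SI(\mathcal{H})} Q$). Indeed
if $I(H)$ belongs to the set $SI(P, \mathcal{H})$ then it also belongs to $SI(P, \mathcal{K})$ since
$\mathcal{H} \subseteq \mathcal{K}$. Since $P \equiv_{SI(\mathcal{K})} Q$ by hypothesis, $I(H)$ also belongs to $SI(Q, \mathcal{K})$, as
well as to $SI(Q, \mathcal{H})$ since $H \subseteq \mathcal{H}$.

(2) We now show that if $\mathcal{H} \subset \mathcal{K}$ then $\equiv_{SI(\mathcal{K})} \ \subset\ \equiv_{SI(\mathcal{H})}$. Since we know from (1)
that $\equiv_{SI(\mathcal{K})} \ \subseteq\ \equiv_{SI(\mathcal{H})}$, we just have to show that there exist two programs $P$
and $Q$ such that $P \not\equiv_{SI(\mathcal{K})} Q$ and $P \equiv_{SI(\mathcal{H})} Q$.
Let $b \in \mathcal{K}$ and $b \notin \mathcal{H}$. Consider the program

$P$: $a \leftarrow b$

and let $Q$ be the empty program. It is easy to see that $P \equiv_{SI(\mathcal{H})} Q$ since:

$SI(P, \mathcal{H}) = SI(Q, \mathcal{H}) = \{I(I) \mid I \subseteq \mathcal{H}\}$.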

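On the other hand $P \not\equiv_{SI(\mathcal{K})} Q$ since $\{a, b\}(\{b\}) \in SI(P, \mathcal{K})$
while $\{a, b\}(\{b\}) \notin SI(Q, \mathcal{K})$.


Proposition 5

$\equiv_{SI(\mathcal{B}^+)}$ is compositional for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$.

Proof. Since $not\_\mathcal{B} \subset \mathcal{B}^+$ we have, by Proposition 4, that $\equiv_{SI(\mathcal{B}^+)} \ \subset\ \equiv_{SI(not\_\mathcal{B})}$,
that is, $\equiv_{SI(\mathcal{B}^+)}$ preserves $\equiv_{SI(not\_\mathcal{B})}$.

Moreover $\equiv_{SI(\mathcal{B}^+)}$ is a congruence for $\{\cup\}$ since $\equiv_M$ is a congruence for $\{\cup\}$
and since $\equiv_M \ =\ \equiv_{SI(\mathcal{B}^+)}$ by Proposition 3.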



Proposition 6

$\equiv_{SI(\mathcal{B}^+)}$ is fully abstract for $(\equiv_{SI(not\_\mathcal{B})}, \{\cup\})$.

Proof. By contraposition, we show that if $P \not\equiv_{SI(\mathcal{B}^+)} Q$ then there is a context
in which the substitution of $P$ with $Q$ does modify the set of negatively supported
interpretations. Let $P$ and $Q$ be two programs such that $P \not\equiv_{SI(\mathcal{B}^+)} Q$.

(1) We first show that there exists a finite program $R$ such that:

$LHM(P \cup R) \neq LHM(Q \cup R)$.

Since we are considering positive versions of programs, $P$ and $Q$ are definite
programs and hence, by Proposition 3:

$P \not\equiv_{SI(\mathcal{B}^+)} Q \Leftrightarrow P \not\equiv_M Q$.

This means that:

$\exists I \subseteq \mathcal{B}^+ : I \in M(P) \wedge I \notin M(Q)$.

(The other case is analogous.) That is, by definition of Herbrand model:

$\exists I \subseteq \mathcal{B}^+ : T(P)(I) \subseteq I \wedge T(Q)(I) \not\subseteq I$.

Therefore, by definition of $T(Q)$:

$\exists F, A : F \subseteq I \wedge F$ finite $\wedge\ A \in T(Q)(F) \wedge A \notin I$.

Consider now the program:

$R = \{B \leftarrow \mid B \in F\}$.

• We now show that $A \notin LHM(P \cup R)$, that is, $A \notin T^\omega(P \cup R)(\emptyset)$.

We first show, by induction on $n$, that:

$\forall n : T^n(P \cup R)(\emptyset) \subseteq I$.

Indeed $T^0(P \cup R)(\emptyset) = \emptyset$. Assume now that $T^n(P \cup R)(\emptyset) \subseteq I$. By
definition of powers of $T$:

$T^{n+1}(P \cup R)(\emptyset) = T(P \cup R)(T^n(P \cup R)(\emptyset))$.

By inductive hypothesis and by the monotonicity of $T$:

$T^{n+1}(P \cup R)(\emptyset) \subseteq T(P \cup R)(I)$.

Then by definition of $\cup$:

$T^{n+1}(P \cup R)(\emptyset) \subseteq T(P)(I) \cup T(R)(I)$.

Since $T(P)(I) \subseteq I$ and since $F \subseteq I$, we then have that:

$T^{n+1}(P \cup R)(\emptyset) \subseteq I$.

Therefore, by the continuity of $T$:

$T^\omega(P \cup R)(\emptyset) \subseteq I$

and hence, since $A \notin I$, we conclude that:

$A \notin LHM(P \cup R)$.

• We now show that $A \in LHM(Q \cup R)$, that is, $A \in T^\omega(Q \cup R)(\emptyset)$. Indeed:

$A \in T(Q)(F) \subseteq T(Q)(T(R)(\emptyset))$.

By definition of $\cup$ and by monotonicity of $T$:

$T(Q)(T(R)(\emptyset)) \subseteq T(Q)(T(Q \cup R)(\emptyset)) \subseteq T(Q \cup R)(T(Q \cup R)(\emptyset))$

hence

$A \in T^2(Q \cup R)(\emptyset) \subseteq T^\omega(Q \cup R)(\emptyset)$

by continuity of $T$.

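(2) We have therefore shown that if $P \not\equiv_{SI(\mathcal{B}^+)} Q$ then

$\exists R : LHM(P \cup R) \neq LHM(Q \cup R)$.

Notice that, since we are considering positive versions of programs, $R$ can
be a finite program of the form:

$R = \{B \leftarrow \mid B \in F\}$

where $F \subseteq (\mathcal{B} \cup not\_\mathcal{B})$. Namely $R$ may be a finite program of the form:

$R$:
$C_1 \leftarrow$
$\ldots$
$not\_D_1 \leftarrow$
$\ldots$
$not\_D_n \leftarrow$

Let now $R^+ = \{C \leftarrow \mid C \in F \cap \mathcal{B}\}$ and $R^- = \{not\_D \leftarrow \mid not\_D \in F \cap not\_\mathcal{B}\}$.

• We now observe that if $LHM(P \cup R^+) \neq LHM(Q \cup R^+)$ then:

$\exists I, J : I(\emptyset) \in SI(P \cup R^+, not\_\mathcal{B}) \wedge J(\emptyset) \in SI(Q \cup R^+, not\_\mathcal{B}) \wedge I \neq J$.

This means that there exists a normal program $R^+$ such that:

$P \cup R^+ \not\equiv_{SI(not\_\mathcal{B})} Q \cup R^+$

and hence concludes the proof.

• If instead $LHM(P \cup R^+) = LHM(Q \cup R^+)$ then, since $LHM(P \cup R) \neq
LHM(Q \cup R)$, we observe that:

$LHM(P \cup R^+ \cup R^-) \neq LHM(Q \cup R^+ \cup R^-)$.

Let $F^- = \{not\_D \mid not\_D \leftarrow\ \in R^-\}$. Then:

$\exists I, J : I(F^-) \in SI(P \cup R^+, not\_\mathcal{B}) \wedge J(F^-) \in SI(Q \cup R^+, not\_\mathcal{B}) \wedge I \neq J$.

That is, there exists a normal program $R^+$ such that:

$P \cup R^+ \not\equiv_{SI(not\_\mathcal{B})} Q \cup R^+$.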



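Proposition 7

$\equiv_{SI(\mathcal{B})} \ \subset\ \equiv_{SI(\mathcal{B}_\pi)} \ \subset\ \equiv_{LHM}$.

Proof. By Proposition 4 we know that if $\mathcal{B}_\pi \subset \mathcal{B}$ then $\equiv_{SI(\mathcal{B})} \ \subset\ \equiv_{SI(\mathcal{B}_\pi)}$. Moreover
if $P \equiv_{SI(\mathcal{B}_\pi)} Q$ then $LHM(P) = LHM(Q)$, that is, $\equiv_{SI(\mathcal{B}_\pi)} \ \subseteq\ \equiv_{LHM}$. To show
that $\equiv_{SI(\mathcal{B}_\pi)} \ \neq\ \equiv_{LHM}$, consider the programs

$P$: $a \leftarrow b$

$Q$: $a \leftarrow c$

where $\{b, c\} \subseteq \mathcal{B}_\pi$. It is easy to observe that $P \equiv_{LHM} Q$ while $P \not\equiv_{SI(\mathcal{B}_\pi)} Q$
since for instance $\{a, b\}(\{b\}) \in SI(P, \mathcal{B}_\pi) - SI(Q, \mathcal{B}_\pi)$.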



Proposition 8 

=Si(B^) is compositional w.r.t (0,{U^}). 

Proof. Indeed preserves O by Prop. 7. 

Moreover is a congruence for U.^-, namely VP, Q: 

P =SI(B„) Q P D =si(B^) Q 'Jtt D 
for any extension D. 

Let H f- Btt- We observe that, since both D and H are extensional programs: 
LHM{{P D)\JH) = LHM{P LHM{D U H)). 

Therefore, since defs{D LI H) C n and since P =si(b^) Q- 

LHM{P LHM{D U H)) = LHM{Q LHM{D U H)) 
and hence 

LHM{{P D)LH) = LHM{{Q D) U H). 



Proposition 9 

= SI(B+) C =SI(B„Unot^) C =SI(not_B)- 

Proof. Immediate by Proposition 4, whenever tt yf 0 (and Bt^ C B). 



Proposition 10 

— SI{BT^\Jnot_B) is compositional for (=5/(not_S)5 {b^Tr})- 
Proof. The proof is analogous to the proof of Proposition 8.

Abstract. One recent advance in program development has been the
application of abstract interpretation to verify the partial correctness of 
a (constraint) logic program. Traditionally forwards analysis has been 
applied that starts with an initial goal and traces the execution in the 
direction of the control-flow to approximate the program state at each 
program point. This is often enough to verify assertions that a property 
holds. The dual approach is to apply backwards analysis to propagate 
properties of the allowable states against the control-flow to infer queries 
for which the program will not violate any assertion. Backwards analysis 
also underpins other program development tasks such as verifying the 
termination of a logic program or proving that a logic program with a 
delay mechanism cannot reduce to a state that contains sub-goals which 
suspend indefinitely. This paper reviews various backwards analyses that 
have been proposed for logic programs, identifying common threads in 
these techniques. The analyses are explained through a series of worked 
examples. The paper concludes with some suggestions for research in 
backwards analysis for logic program development. 



1 Introduction 

Abstract interpretation has an important role in program development and 
specifically the verification and debugging of (constraint) logic programs, as re- 
cently demonstrated in [12,42,57]. In this context, programmers are typically 
equipped with an annotation language in which they encode properties of the 
program state at various program points [56,64]. One approach to verification 
of logic programs is to trace the program state in the direction of control-flow 
from an initial goal (forwards analysis), using abstract interpretation to finitely 
represent and track the state. The program is deemed to be correct if all the 
assertions are satisfied whenever they are encountered during the execution of 
the program; otherwise the program is potentially buggy. The dual approach is 
to trace execution against the control-flow (backwards analysis) to infer those 
queries which ensure that the assertions are satisfied should they be encountered
[39,44]. If the class of initial queries does not conform to those expected by the
programmer, then the program is potentially buggy. 

Many program properties cannot be simply verified by checking (an abstrac- 
tion of) the program store - they are properties of sequences of program states. 
For example, termination checking of logic programs attempts to verify that a 
logic program left-terminates for a given query [10,53]. This amounts to check- 
ing that sequences of program states are necessarily finite. Suspension analysis 
of concurrent logic programs [8,11,17] also reasons about sequences of states. It 
aims to verify that a given state cannot lead to another which possesses a sub- 
goal that suspends indefinitely. These classic analyses are inherently forwards 
since they trace sequences of states in the direction of control-flow. Nevertheless 
these forwards analysis problems (and various related analyses) have correspond- 
ing backwards analysis problems, tracing requirements against the control-flow 
(specifying the backwards analysis will be referred to as reversal). The reversal of
termination checking is termination inference which infers initial queries under 
which a logic program left-terminates [24,53]. The reversal of suspension analy- 
sis is suspension inference [26] which infers a class of goals that will not lead to 
suspended (floundering) sub-goals. It has been observed [24] that the “missing 
link” between termination inference and termination checking is the backwards 
analysis of [39] . Likewise suspension inference [26] relies on ideas inherited from 
backwards analysis [39]. 

The unifying idea behind these various backwards analyses [24,26,39,40,44]
is reasoning about reverse information flow. In abstract interpretation,
information is represented, albeit in an approximate way, with an abstract domain
which is a lattice $\langle D, \leq, \oplus, \otimes \rangle$. The ordering $\leq$ expresses the relative
precision of two domain elements; the join $\oplus$ models the merging of computation
paths whereas meet $\otimes$ models the conjunction of constraints. To propagate
information against the control-flow, the analyses of [24,26,39,40] (but notably not
that of [44]) require the abstract domain $D$ to be relatively pseudo-complemented,
that is, the relative pseudo-complement of two domain elements uniquely exists.
The pseudo-complement of $d_1$ relative to $d_2$, denoted $d_1 \to d_2$, delivers the
weakest element of $D$ whose conjunction with $d_1$ implies $d_2$, or more exactly,
$d_1 \to d_2 = \oplus\{d \in D \mid d \otimes d_1 \leq d_2\}$. The role of relative pseudo-complement is
that if $d_2$ expresses a set of requirements that must hold after a constraint is
added to the store, and $d_1$ models the constraint itself, then $d_1 \to d_2$ expresses
the requirements that must hold on the store before the constraint. Relative
pseudo-complement is central to many backwards analyses.

Not all domains come equipped with a relative pseudo-complement, but it 
turns out that it is always possible to synthesise a domain for backwards analysis 
for some given downward closed property by applying Heyting completion [30] . 
Heyting completion enriches a domain with new elements so that the relative 
pseudo-complement is well-defined. All domains that are condensing possess a 
relative pseudo-complement. Examples of condensing domains include the class 
of positive Boolean functions [37,46], the relational type domain of [9], and the 
domain of directional types [1,30]. The requirement for a domain to be relatively
pseudo-complemented is one of the major restrictions of backwards analysis.
Despite this limitation, backwards analysis still offers two key advantages over 
forwards analysis for program development problems. These advantages are sum- 
marised below: 

- Backwards analysis generalises forwards analysis in that a single application
of backwards analysis can subsume many applications of a forwards analysis.
Another advantage that relates to ease of use is that backwards analysis is
not dependent on the programmer (or the module system) for providing a
top-level goal. This is because backwards analysis is goal-independent.

- In terms of implementation, backwards analysis strikes a good balance
between generality and simplicity. Moreover, backwards analysis does not
necessarily incur a performance penalty over forwards analysis.

Both of these points are multi-faceted and therefore they require some
unpacking. Returning to the first point - the issue of generality - forwards
analysis is driven from a top-level goal. Forwards analysis then verifies that the goal
(and those goals it recursively calls) satisfy a set of requirements. Conversely, 
backwards analysis infers a class of goals all of which are guaranteed to satisfy 
the requirements. Under certain algebraic conditions this class is maximal with 
respect to forwards analysis [40]; it describes all those goals that can be verified 
with forwards analysis. For example, consider the problem of understanding how 
to re-use code developed by a third party. In the context of logic programming, 
part of this problem reduces to figuring out how to query a program. If the logic 
program does not come with any documentation, then the programmer is forced 
to experiment with queries in an ad hoc fashion. More systematically, forwards 
analysis could be repeatedly applied to discover queries which are consistent 
with the called builtins in that the calls do not generate any instantiation er- 
rors. By way of contrast the backwards analysis framework when instantiated 
with a domain for tracking groundness dependencies [37,46] yields an analysis 
for mode inference which would discover (in a single application) all queries that 
will not generate any instantiation errors. This recovered mode information then
provides valuable insight into the behaviour of the program.

Expanding the second point - the issue of implementation - the analyses
presented in [26,39,40] reduce to two simple bottom-up fixpoint computations: a
least fixpoint (lfp) and a greatest fixpoint (gfp). The lfp and the gfp calculations
can be ordered and thus de-coupled. This significantly simplifies the tracking
of dependencies which is the main source of complexity in an efficient fixpoint
engine. Moreover, although few forwards analyses have been compared
experimentally against backwards analyses, the notable exception is in termination
inference. In this context, the speed of inference appears to at least match that
of checking [24]. In fact the total analysis time for checking and inference can be
broken down into Joint - the time spent on activities common to both checking
and inference - and Inf and Check - the time spent on activities specific to
inference and checking respectively. Joint dominates both Inf and Check but Inf is
typically smaller than Check [24]. Therefore the generality of inference does not
necessarily incur a performance penalty.

Backwards analysis has been applied extensively in functional programming
in, among other things, projection analysis [65], stream strictness analysis [31]
and inverse image analysis [18]. Furthermore, backwards reasoning on imperative
programs dates back to the early days of static analysis [14]. In contrast, back- 
wards analysis has until very recently [19,22,24,26,40,44,49] been rarely applied 
in logic programming. The aim of this paper is thus to promote the use of back- 
wards analysis especially within the context of logic program development. To 
this end, the paper explains the key ideas behind backwards analyses for mode 
inference, termination inference, suspension inference and type inference. These 
analyses are each described in an informal way through a series of worked ex- 
amples in sections 2, 3, 4 and 5 respectively. Each of these sections includes its 
own related work section. Section 6 then reviews directions for future work on 
backwards analysis for program development and section 7 concludes. 

2 Backwards Mode Inference 

The objective of backwards analysis is to infer queries for which the program 
is guaranteed to either not violate any assertion or satisfy some operational 
requirement. To realise this objective, backwards analyses propagate require- 
ments of the allowable states against the control-flow. This tactic essentially 
reinterprets the calculation of weakest pre-conditions [35] for logic programming 
using abstract interpretation techniques. To illustrate these ideas, consider the 
problem of mode inference [39]. In mode inference, the problem is to deduce 
moding properties which, if satisfied by the initial query, ensure that the result- 
ing derivations cannot encounter an instantiation error. Instantiation errors arise 
when a builtin is called with insufficiently instantiated arguments. For example, 
the Prolog builtins tab and put require their first (and only) argument to be 
bound to a ground term otherwise they error. Conversely, the builtin is requires 
its last argument to be ground. Other builtins such as the arithmetic tests =:=, 
<, >, etc require both arguments to be ground. These grounding requirements 
can be expressed with the domain of positive Boolean functions, Pos, which is 
traditionally used to track groundness dependencies [37,46]. Pos is the set of
functions $f : \{true, false\}^n \to \{true, false\}$ such that $f(true, \ldots, true) = true$.
For example, $X \wedge (Y \leftarrow Z) \in Pos$ since $true \wedge (true \leftarrow true) = true$. The formula
describes states in which X is ground and Y is ground whenever Z is ground.
Observe that this grounding property is closed under instantiation: if $X \wedge (Y \leftarrow Z)$
describes the state of the store, then both $X$ and $Y \leftarrow Z$ still hold whenever the
store is conjoined with additional constraints. When augmented with false, Pos
forms the lattice $\langle Pos, \models, \vee, \wedge \rangle$ where $\models$ denotes the entailment ordering, $\wedge$ is
logical conjunction and $\vee$ is logical disjunction. The top and bottom elements
of the lattice are true and false respectively.
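The definition of Pos can be made concrete with a small truth-table sketch. The representation below (formulae as Python functions over an assignment dictionary, and the helper names `in_pos` and `entails`) is an assumption chosen for illustration and is not taken from any of the cited analysers.

```python
from itertools import product

def assignments(variables):
    """Yield every truth assignment over the given variables as a dict."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

def in_pos(formula, variables):
    """A Boolean function is positive if it maps the all-true assignment to true."""
    return formula({v: True for v in variables})

def entails(f, g, variables):
    """f |= g: g holds under every assignment that satisfies f."""
    return all(g(a) for a in assignments(variables) if f(a))

VARS = ["X", "Y", "Z"]
f = lambda a: a["X"] and (a["Y"] or not a["Z"])    # X /\ (Y <- Z)
x = lambda a: a["X"]                               # X
y_if_z = lambda a: a["Y"] or not a["Z"]            # Y <- Z
not_x = lambda a: not a["X"]                       # not positive
extra = lambda a: a["Z"]                           # a further constraint on the store

print(in_pos(f, VARS), in_pos(not_x, VARS))
# Closure under instantiation: conjoining more constraints keeps X and Y <- Z.
stronger = lambda a: f(a) and extra(a)
print(entails(stronger, x, VARS), entails(stronger, y_if_z, VARS))
```

Entailment here is simply inclusion of models, which is the ordering of the lattice; meet and join correspond to `and` and `or` on the underlying functions.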

The assertions that are used in mode inference are Pos abstractions that ex- 
press grounding requirements which ensure that instantiation errors are avoided. 
Specifically, an assertion is added to the program for each call to a builtin. The 
assertion itself precedes the call [39]. It is important to appreciate that the
assertions only codify sufficient conditions; necessary conditions for the absence of
instantiation errors cannot always be expressed within Pos. For example, the
assertion $X \vee (Y \wedge Z)$ describes states for which the builtin functor(X, Y, Z)
will not produce an instantiation error. Observe, however, that the same call will
not generate an instantiation error if X is bound to a non-variable (non-ground)
term such as [W|Ws], hence $X \vee (Y \wedge Z)$ is not a necessary condition for avoiding
an instantiation error. Observe too that $X \vee (Y \wedge Z)$ only ensures that the call will
not generate an instantiation error. For instance, a domain error will be thrown
whenever functor(X, Y, Z) is called with Z instantiated to a negative integer.
However a richer domain, such as the numeric power domain introduced in
[40], could express this positivity requirement on the variable Z. (The subtlety of
reasoning about builtins is not confined to backwards analysis. In fact correctly
and precisely encoding the behaviour of the builtins is often the most difficult
part of any analysis [33,36].)

2.1 Worked Example on Mode Inference 

To appreciate how the assertions and lattice operations $\vee$ and $\wedge$ fit together and
why the domain is required to be relatively pseudo-complemented, it is helpful
to consider a worked example. Thus consider the quicksort program listed in
Figure 1 and the problem of computing those queries that avoid instantiation
errors. The quicksort program is coded in Prolog and therefore the comma
operator denotes sequential (rather than parallel) goal composition. A difference
list is used to amortise the cost of appending the two lists produced by the goals
qs(L, S, [M | R]) and qs(H, R, T).



    qs([], S, S) :- true.
    qs([M | Xs], S, T) :- pt(Xs, M, L, H), qs(L, S, [M | R]), qs(H, R, T).
    pt([], _, [], []) :- true.
    pt([X | Xs], M, [X | L], H) :- M =< X, pt(Xs, M, L, H).
    pt([X | Xs], M, L, [X | H]) :- M > X, pt(Xs, M, L, H).

Fig. 1. quicksort program expressed in Prolog



The backwards analysis consists of two computational steps. The first is a
least fixpoint (lfp) calculation and the second is a greatest fixpoint (gfp)
computation. The lfp is an analysis in its own right. It infers success patterns that are
required for the gfp computation. The success pattern for a given predicate
characterises the bindings made by the predicate whenever it succeeds; in this context
the success patterns are described as groundness dependencies. Specifically, the
lfp is a set of calls paired with groundness dependencies which describe how a
call to each predicate in the program can succeed. The gfp is an analysis for
input modes (the objective of the backwards analysis). To simplify both steps, the
program is put into a form in which the arguments of head and body atoms are
distinct variables. This gives the normalised program listed in the first column of
Figure 2. To clearly differentiate assertions from the (Herbrand) constraints that
occur within the program, the program is expressed in the concurrent constraint
style [60] using ask to denote an assertion and tell to indicate a conventional store
write. This notation (correctly) suggests that an assertion reads and checks a
property of the store. Empty conjunctions of atoms are denoted by true. The
process of normalisation does not introduce any assertions and therefore the
program in the first column of Figure 2 includes only tell constraints. Note that
each clause contains a single tell constraint which appears immediately before
the (normalised) atoms that constitute the body of the clause.

After normalisation, the program is abstracted by replacing each tell
constraint $x = f(x_1, \ldots, x_n)$ with a formula $x \leftrightarrow \wedge_{i=1}^{n} x_i$ that describes its
groundness dependencies. This gives the abstract program listed in the second column
of Figure 2. Builtins that are called from the program, such as the tests =< and
>, are handled by augmenting the abstract program with fresh predicates, =<'
and >', which capture the grounding behaviour of the builtins. Assertions are
introduced immediately after the head of these fresh clauses which specify a
mode that is sufficient for the builtin not to generate an instantiation error. For
example, the ask formula in the =<' clause asserts that the =< test will not error if
its two arguments are ground, whereas the tell formula describes the state
that holds if the test succeeds. For uniformity, all clauses contain both an ask
and a tell. This normal form simplifies the presentation of the theory as well
as the structure of the abstract interpretation itself. In practice, the asks of most
clauses are true and thus vacuous. In the case of quicksort, the only non-trivial
assertions arise from builtins. This would change if the programmer introduced
assertions for purposes of verification [56].

2.2 Least Fixpoint Calculation 

An iterative algorithm is used to compute the lfp and thereby characterise the
success patterns of the program. A success pattern is a pair consisting of an
atom with distinct variables for arguments paired with a Pos formula over those
variables which describes the groundness dependencies between the arguments.
Renaming and equality of formulae induce an equivalence between success
patterns which is needed to detect the fixpoint. The patterns $\langle p(u, w, v), u \wedge (w \to v) \rangle$
and $\langle p(x_1, x_2, x_3), (x_3 \leftarrow x_2) \wedge x_1 \rangle$, for example, are considered to be identical:
both express the same inter-argument groundness dependencies. Each iteration
produces a set of success patterns: at most one pair for each predicate in the
program.

Upper Approximation of Success Patterns A success pattern records
an inter-argument groundness dependency that describes the binding effects
of executing a predicate. If $\langle p(\vec{x}), f \rangle$ correctly describes the predicate $p$, and
$g$ holds whenever $f$ holds, then $\langle p(\vec{x}), g \rangle$ also correctly describes $p$. Note that

    qs(T1, S, T2) :-                        qs(T1, S, T2) :-
        tell(T1 = [], T2 = S),                  ask(true),
        true.                                   tell(T1 ∧ (T2 ↔ S)),
                                                true.
    qs(T1, S, T) :-                         qs(T1, S, T) :-
        tell(T1 = [M|Xs], T3 = [M|R]),          ask(true),
        pt(Xs, M, L, H),                        tell(T1 ↔ (M ∧ Xs) ∧ T3 ↔ (M ∧ R)),
        qs(L, S, T3),                           pt(Xs, M, L, H),
        qs(H, R, T).                            qs(L, S, T3),
                                                qs(H, R, T).
    pt(T1, _, T2, T3) :-                    pt(T1, _, T2, T3) :-
        tell(T1 = [], T2 = [], T3 = []),        ask(true),
        true.                                   tell(T1 ∧ T2 ∧ T3),
                                                true.
    pt(T1, M, T2, H) :-                     pt(T1, M, T2, H) :-
        tell(T1 = [X|Xs], T2 = [X|L]),          ask(true),
        M =< X,                                 tell(T1 ↔ (X ∧ Xs) ∧ T2 ↔ (X ∧ L)),
        pt(Xs, M, L, H).                        =<'(M, X),
                                                pt(Xs, M, L, H).
    pt(T1, M, L, T2) :-                     pt(T1, M, L, T2) :-
        tell(T1 = [X|Xs], T2 = [X|H]),          ask(true),
        M > X,                                  tell(T1 ↔ (X ∧ Xs) ∧ T2 ↔ (X ∧ H)),
        pt(Xs, M, L, H).                        >'(M, X),
                                                pt(Xs, M, L, H).
                                            =<'(M, X) :-
                                                ask(M ∧ X), tell(M ∧ X), true.
                                            >'(M, X) :-
                                                ask(M ∧ X), tell(M ∧ X), true.

Fig. 2. quicksort program with assertions and as a Pos abstraction



here and henceforth $\vec{x}$ denotes a vector of distinct variables. Success patterns
can thus be approximated from above without compromising correctness.
Iteration is performed in a bottom-up fashion, $T_P$-style [28], and commences with
$F_0 = \{\langle p(\vec{x}), false \rangle \mid p \in \Pi\}$ where $\Pi$ is the set of predicates occurring in the
program. $F_0$ is the bottom element of the lattice of success patterns; the top
element is $\{\langle p(\vec{x}), true \rangle \mid p \in \Pi\}$. $F_{j+1}$ is computed from $F_j$ by considering each
clause $p(\vec{x})$ :- ask($d$), tell($f$), $p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$ in turn. It is at this stage that
the lattice structure of Pos comes into play. Meet (the operator $\wedge$ which is also
known as greatest lower bound) provides a way of conjoining information from
different body atoms, while join (the operator $\vee$ which is also known as least
upper bound) is used to combine the information from different clauses. More
exactly, the success pattern formulae $f_i$ for the $n$ body atoms $p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$ are
conjoined with $f$ to obtain $g = f \wedge (\wedge_{i=1}^{n} f_i)$. Variables not present in $p(\vec{x})$, $Y$ say,
are then eliminated from $g$. The Schröder elimination principle provides a way
of eliminating a variable from a given formula. It enables a projection operator
$\exists x$ to be defined by $\exists x(f) = f[x \mapsto true] \vee f[x \mapsto false]$ which eliminates $x$ from
$f$. Since $f \models \exists x(f)$, computing $g' = \exists Y(g)$ where $\exists \{y_1 \ldots y_n\}(g) = \exists y_1(\ldots \exists y_n(g))$
weakens $g$. Weakening $g$ does not compromise correctness because success
patterns can be safely approximated from above.
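The Schröder elimination step can be sketched on a small formula. The example formula $(T1 \leftrightarrow (M \wedge Xs)) \wedge Xs$ and the helper names `exists`, `entails` and `equivalent` are assumptions made for illustration; formulae are modelled as Python functions over an assignment dictionary.

```python
from itertools import product

def exists(x, f):
    """Schroder elimination: (exists x. f) = f[x -> true] \\/ f[x -> false]."""
    return lambda a: f({**a, x: True}) or f({**a, x: False})

def entails(f, g, variables):
    """f |= g: g holds under every assignment that satisfies f."""
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if f(a) and not g(a):
            return False
    return True

def equivalent(f, g, variables):
    return entails(f, g, variables) and entails(g, f, variables)

VARS = ["T1", "M", "Xs"]
# g = (T1 <-> (M /\ Xs)) /\ Xs, with Xs the variable to be projected away
g = lambda a: (a["T1"] == (a["M"] and a["Xs"])) and a["Xs"]
g_projected = exists("Xs", g)
t1_iff_m = lambda a: a["T1"] == a["M"]

print(equivalent(g_projected, t1_iff_m, VARS))   # projection leaves T1 <-> M
print(entails(g, g_projected, VARS))             # projection only weakens g
```

Since the projected formula never mentions the eliminated variable, it can be read directly as a formula over the head variables.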



Weakening Upper Approximations The pattern $\langle p(\vec{x}), g'' \rangle$ where $g''$ is the
current Pos abstraction is then replaced with $\langle p(\vec{x}), g' \vee g'' \rangle$ where $g'$ is computed
as above. Thus the success patterns become progressively weaker (or at least
not stronger) on each iteration. Again, correctness is preserved because success
patterns can be safely approximated from above.



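Least Fixpoint Calculation for Quicksort The lfp for the abstracted
quicksort program is obtained (and checked) in the following 3 iterations:

$$F_1 = \left\{ \begin{array}{l}
\langle qs(x_1, x_2, x_3),\; x_1 \wedge (x_2 \leftrightarrow x_3) \rangle \\
\langle pt(x_1, x_2, x_3, x_4),\; x_1 \wedge x_3 \wedge x_4 \rangle \\
\langle {=<}'(x_1, x_2),\; x_1 \wedge x_2 \rangle \\
\langle {>}'(x_1, x_2),\; x_1 \wedge x_2 \rangle
\end{array} \right\}$$

$$F_2 = \left\{ \begin{array}{l}
\langle qs(x_1, x_2, x_3),\; x_2 \leftrightarrow (x_1 \wedge x_3) \rangle \\
\langle pt(x_1, x_2, x_3, x_4),\; x_1 \wedge x_3 \wedge x_4 \rangle \\
\langle {=<}'(x_1, x_2),\; x_1 \wedge x_2 \rangle \\
\langle {>}'(x_1, x_2),\; x_1 \wedge x_2 \rangle
\end{array} \right\}$$

Finally, $F_3 = F_2$. The space of success patterns forms a complete lattice which
ensures that an lfp exists. The iterative process will always terminate since the
space is finite and hence the number of times each success pattern can be updated
is also finite. Moreover, it will converge onto the lfp since (so-called Kleene)
iteration commences with the bottom element $F_0$.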

Observe that $F_2$, the lfp, faithfully describes the grounding behaviour of
quicksort: a qs goal will ground its second argument if it is called with its first
and third arguments already ground and vice versa. Note that assertions are not
considered in the lfp calculation.
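The iteration can be reproduced with a brute-force sketch. It assumes a truth-table representation that is not the one used by the cited analysers: each success pattern is the set of satisfying assignments of its formula over the argument positions, so conjunction, projection and join become set operations. The names `CLAUSES`, `step`, `lfp`, `le` and `gt` (standing for =<' and >') and the variable `U` (standing for the anonymous argument of pt) are hypothetical.

```python
from itertools import product

iff = lambda p, q: p == q

# (predicate, head arguments, tell formula, body atoms); asks are ignored here.
CLAUSES = [
    ("qs", ["T1", "S", "T2"], lambda a: a["T1"] and iff(a["T2"], a["S"]), []),
    ("qs", ["T1", "S", "T"],
     lambda a: iff(a["T1"], a["M"] and a["Xs"]) and iff(a["T3"], a["M"] and a["R"]),
     [("pt", ["Xs", "M", "L", "H"]), ("qs", ["L", "S", "T3"]), ("qs", ["H", "R", "T"])]),
    ("pt", ["T1", "U", "T2", "T3"], lambda a: a["T1"] and a["T2"] and a["T3"], []),
    ("pt", ["T1", "M", "T2", "H"],
     lambda a: iff(a["T1"], a["X"] and a["Xs"]) and iff(a["T2"], a["X"] and a["L"]),
     [("le", ["M", "X"]), ("pt", ["Xs", "M", "L", "H"])]),
    ("pt", ["T1", "M", "L", "T2"],
     lambda a: iff(a["T1"], a["X"] and a["Xs"]) and iff(a["T2"], a["X"] and a["H"]),
     [("gt", ["M", "X"]), ("pt", ["Xs", "M", "L", "H"])]),
    ("le", ["M", "X"], lambda a: a["M"] and a["X"], []),
    ("gt", ["M", "X"], lambda a: a["M"] and a["X"], []),
]

def step(F):
    """One bottom-up iteration: conjoin tell and body patterns, project, join."""
    new = {pred: set(models) for pred, models in F.items()}
    for pred, head, tell, body in CLAUSES:
        # In this program every tell variable also occurs in the head or body.
        variables = sorted(set(head).union(*(args for _, args in body)))
        for values in product([False, True], repeat=len(variables)):
            a = dict(zip(variables, values))
            if tell(a) and all(tuple(a[v] for v in args) in F[q] for q, args in body):
                # Keeping only the head positions performs the projection;
                # adding to the existing set performs the join.
                new[pred].add(tuple(a[v] for v in head))
    return new

def lfp():
    """Kleene iteration from the bottom element (every pattern is false)."""
    F = {pred: set() for pred, _, _, _ in CLAUSES}
    iterations = 0
    while True:
        nxt = step(F)
        iterations += 1
        if nxt == F:
            return F, iterations
        F = nxt

F, iterations = lfp()
print(iterations, sorted(F["qs"]))
```

Under these assumptions the models recorded for qs are exactly those of $x_2 \leftrightarrow (x_1 \wedge x_3)$, and the third iteration only confirms that the second was already stable.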

2.3 Greatest Fixpoint Calculation 

A bottom-up strategy is used to compute a gfp and thereby characterise the
safe call patterns of the program. A safe call pattern describes queries that do
not lead to violation of the assertions. A call pattern has the same form as a
success pattern (so there is one call pattern per predicate rather than one per
clause). The analysis starts by checking that no call causes an error by reasoning
backwards over all clauses. If an assertion is violated, the set of safe call patterns
for the involved predicate is strengthened (made smaller), and the whole process
is repeated until the assumptions turn out to be valid (the gfp is reached).



Lower Approximation of Safe Call Patterns Iteration commences with the
top element $D_0 = \{\langle p(\vec{x}), true \rangle \mid p \in \Pi\}$. An iterative algorithm incrementally
strengthens the call pattern formulae until they only describe queries which lead
to computations that satisfy the assertions. Note that call patterns describe a
subset, rather than a superset, of those queries which are safe. Call patterns
are thus lower approximations, in contrast to success patterns which are
upper approximations. Put another way, if $\langle p(\vec{x}), g \rangle$ correctly describes some safe
call patterns of $p$, and $g$ holds whenever $f$ holds, then $\langle p(\vec{x}), f \rangle$ also correctly
describes some safe call patterns of $p$. Call patterns can thus be approximated
from below without compromising correctness (but not from above). $D_{k+1}$ is
computed from $D_k$ by applying each clause $p(\vec{x})$ :- ask($d$), tell($f$), $p_1(\vec{x}_1), \ldots, p_n(\vec{x}_n)$ in
turn and calculating a formula that characterises its safe calling modes. A safe
calling mode is calculated by propagating moding requirements right-to-left by
repeated application of the logical operator $\to$. More exactly, let $f_i$ denote the
success pattern formula for $p_i(\vec{x}_i)$ in the previously computed lfp and let $d_i$
denote the call pattern formula for $p_i(\vec{x}_i)$ in $D_k$. Set $e_{n+1} = true$ and then compute
$e_i = d_i \wedge (f_i \to e_{i+1})$ for $1 \leq i \leq n$. Each $e_i$ describes a safe calling mode for the
compound goal $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$.



Intuition and Explanation The intuition behind the symbolism is that $d_i$
represents the demand that is already known in order for $p_i(\vec{x}_i)$ not to error
whereas $e_i$ is $d_i$ possibly strengthened with extra demand so as to ensure that the
sub-goal $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ also does not error when executed immediately
after $p_i(\vec{x}_i)$. Put another way, anything larger than $d_i$ may possibly cause an
error when executing $p_i(\vec{x}_i)$ and anything larger than $e_i$ may possibly cause an
error when executing $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$.

The basic inductive step in the analysis is to compute an $e_i$ which ensures that
$p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$ does not error, given $d_i$ and $e_{i+1}$ which respectively ensure
that $p_i(\vec{x}_i)$ and $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ do not error. This step propagates a
demand after the call to $p_i(\vec{x}_i)$ into a demand before the call to $p_i(\vec{x}_i)$. The
tactic is to set $e_{n+1} = true$ and then compute $e_i = d_i \wedge (f_i \to e_{i+1})$ for $i \leq n$.
This tactic is best explained by unfolding the definitions of $e_n$, then $e_{n-1}$, then
$e_{n-2}$, and so on. This reverse ordering reflects the order in which the $e_i$ are
computed; the $e_i$ are computed whilst walking backwards across the clause.
Any calling mode is safe for the empty goal and hence $e_{n+1} = true$. Note that
$e_n = d_n \wedge (f_n \to e_{n+1}) = d_n \wedge (\neg f_n \vee true) = d_n$. Hence $e_n$ represents a safe
calling mode for the goal $p_n(\vec{x}_n)$.

Observe that $e_i$ should not be larger than $d_i$, otherwise an error may
occur while executing $p_i(\vec{x}_i)$. Observe too that if $p_i(\vec{x}_i), \ldots, p_n(\vec{x}_n)$ is called with
a mode described by $d_i$, then $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ is called with a mode
described by $(d_i \wedge f_i)$, since $f_i$ describes the success patterns of $p_i(\vec{x}_i)$. The
mode $(d_i \wedge f_i)$ may satisfy the $e_{i+1}$ demand. If it does not, then the minimal
extra demand is added to $(d_i \wedge f_i)$ so as to satisfy $e_{i+1}$. This minimal
extra demand is $((d_i \wedge f_i) \to e_{i+1})$, the weakest mode that, in conjunction with
$(d_i \wedge f_i)$, ensures that $e_{i+1}$ holds. Put another way, $((d_i \wedge f_i) \to e_{i+1}) = \vee\{f \in
Pos \mid (d_i \wedge f_i) \wedge f \models e_{i+1}\}$. Combining the requirements to satisfy $p_i(\vec{x}_i)$ and
then $p_{i+1}(\vec{x}_{i+1}), \ldots, p_n(\vec{x}_n)$ gives $e_i = d_i \wedge ((d_i \wedge f_i) \to e_{i+1})$ which reduces to
$e_i = d_i \wedge (f_i \to e_{i+1})$ because of algebraic properties of condensing domains [30]
and yields the tactic used in the basic inductive step.



    less_than_one(X, Flag) :- X < 1, Flag = 1.
    less_than_one(X, Flag) :- 1 =< X, Flag = 0.

    less_than(X, Y) :- X < Y.

Fig. 3. The less_than_one and less_than predicates



To illustrate how requirements are combined for compound queries,
consider the predicates less_than_one and less_than given in Figure 3. The first
predicate uses a flag to indicate whether its first argument is less than one;
the second predicate is a test which succeeds if and only if its first argument
is less than its second. In particular consider the (artificial) compound query
less_than_one(X, Flag), less_than(X, Flag) which also succeeds whenever
X is less than one. Observe that the query less_than_one(X, Flag) will not
admit an instantiation error if the query is called with X sufficiently
instantiated, that is, if X is ground. It is natural for this property also to hold for
the compound query, since declaratively it encodes the same behaviour (albeit
with some redundancy). However, reasoning about the instantiation
requirements for less_than_one(X, Flag), less_than(X, Flag) is more subtle
because the first sub-goal instantiates Flag thereby partially discharging the
instantiation requirements of the second sub-goal. Moreover, the requirement that
X is ground for the first sub-goal ensures that the same requirement is satisfied
in the second sub-goal. Observe that this interaction is faithfully modelled by
$e_i = d_i \wedge (f_i \to e_{i+1})$. Specifically, with $p_1(\vec{x}_1)$ = less_than_one(X, Flag) and
$p_2(\vec{x}_2)$ = less_than(X, Flag), the demand and success patterns for $p_1(\vec{x}_1)$ and
$p_2(\vec{x}_2)$ are as follows: $d_1 = X$ and $f_1 = X \wedge Flag$, and $d_2 = X \wedge Flag$ and
$f_2 = X \wedge Flag$. Then

$e_3 = true$

$e_2 = d_2 \wedge (f_2 \to e_3) = (X \wedge Flag) \wedge ((X \wedge Flag) \to true) = (X \wedge Flag)$

$e_1 = d_1 \wedge (f_1 \to e_2) = (X) \wedge ((X \wedge Flag) \to (X \wedge Flag)) = X$



Thus it is sufficient to ground X in order to avoid an instantiation error in the
compound goal.

Pseudo-complement This step of calculating the weakest mode that, when
conjoined with $d_i \wedge f_i$, implies $e_{i+1}$ is the very heart of the analysis. Setting
$e_i = false$ would trivially achieve safety, but $e_i$ should be as weak as possible
to maximise the class of safe queries inferred. For Pos, computing the weakest
$e_i$ reduces to applying the $\to$ operator, but more generally this step amounts to
applying the relative pseudo-complement. This operation (if it exists for a given
abstract domain) takes, as input, two abstractions and returns, as output, the
weakest abstraction whose conjunction with the first input abstraction is at least
as strong as the second input abstraction. If the domain does not possess a
relative pseudo-complement, then there is not always a unique weakest abstraction
(whose conjunction with one given abstraction is at least as strong as another
given abstraction).
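For Pos, where the relative pseudo-complement is logical implication, the right-to-left propagation $e_i = d_i \wedge (f_i \to e_{i+1})$ can be sketched as below for the compound goal less_than_one(X, Flag), less_than(X, Flag). The function `safe_modes` and the truth-table helpers are assumptions introduced for illustration; the demands and success patterns are those given in the text.

```python
from itertools import product

VARS = ["X", "Flag"]

def models(f):
    """Satisfying assignments of f over VARS, used to compare formulae."""
    return {values for values in product([False, True], repeat=len(VARS))
            if f(dict(zip(VARS, values)))}

def conj(f, g):
    return lambda a: f(a) and g(a)

def implies(f, g):
    return lambda a: (not f(a)) or g(a)

def safe_modes(demands, successes):
    """Return [e_1, ..., e_n] where e_{n+1} = true and e_i = d_i /\\ (f_i -> e_{i+1})."""
    e = lambda a: True                       # any mode is safe for the empty goal
    modes = []
    for d, f in zip(reversed(demands), reversed(successes)):
        e = conj(d, implies(f, e))           # walk backwards across the goal
        modes.append(e)
    modes.reverse()
    return modes

x = lambda a: a["X"]
x_and_flag = lambda a: a["X"] and a["Flag"]

# d_1 = X, f_1 = X /\ Flag, d_2 = X /\ Flag, f_2 = X /\ Flag
e1, e2 = safe_modes([x, x_and_flag], [x_and_flag, x_and_flag])
print(models(e1) == models(x), models(e2) == models(x_and_flag))
```

The success pattern of the first sub-goal discharges the Flag requirement of the second, which is why only the groundness of X survives in $e_1$.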

To see this, consider the domain Def [2,38] which does not possess a relative
pseudo-complement. Def is the sub-class of Pos that is definite [2,38]. This
means that Def has the special property that each of its Boolean functions can
be expressed as a (possibly empty) conjunction of propositional Horn clauses.
As with Pos, Def is assumed to be augmented with the bottom element false.
Def can thus represent the groundness dependencies $x \wedge y$, $x$, $x \leftrightarrow y$, $y$,
$x \leftarrow y$, $x \rightarrow y$, false and true but not $x \vee y$. Suppose that
$d_i \wedge f_i = (x \leftrightarrow y)$ and $e_{i+1} = (x \wedge y)$. Then conjoining $x$ with
$d_i \wedge f_i$ would be at least as strong as $e_{i+1}$ and symmetrically conjoining $y$
with $d_i \wedge f_i$ would be at least as strong as $e_{i+1}$. However, Def does not contain
a Boolean function strictly weaker than both $x$ and $y$, namely $x \vee y$, whose
conjunction with $d_i \wedge f_i$ is at least as strong as $e_{i+1}$. Thus setting $e_i = x$ or
$e_i = y$ would be safe but setting $e_i = (x \vee y)$ is prohibited because $x \vee y$ falls
outside Def. Moreover, setting $e_i = \mathit{false}$ would lose an unacceptable degree of
precision. A choice would thus have to be made between setting $e_i = x$ and
$e_i = y$ in some arbitrary fashion, so there would be no clear tactic for
maximising precision.
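The absence of a unique weakest abstraction in Def can be reproduced by enumeration. The sketch below is illustrative only and assumes the standard characterisation of definite functions as positive functions whose sets of models are closed under pointwise conjunction; the function names are hypothetical.

```python
from itertools import combinations, product

# Assignments to (x, y); a Boolean function is the frozenset of its models.
ASSIGNMENTS = list(product([False, True], repeat=2))

def all_functions():
    """Yield every Boolean function over (x, y)."""
    for size in range(len(ASSIGNMENTS) + 1):
        for models in combinations(ASSIGNMENTS, size):
            yield frozenset(models)

def in_def(f):
    """Def augmented with false: positive, and models closed under conjunction."""
    if not f:
        return True
    if (True, True) not in f:
        return False
    return all(tuple(p and q for p, q in zip(m, n)) in f for m in f for n in f)

X_IFF_Y = frozenset({(False, False), (True, True)})   # d_i and f_i
X_AND_Y = frozenset({(True, True)})                   # e_{i+1}

def maximal_safe():
    """Weakest elements e of Def with (e and (x <-> y)) entailing (x and y)."""
    safe = [e for e in all_functions() if in_def(e) and (e & X_IFF_Y) <= X_AND_Y]
    return [e for e in safe if not any(e < other for other in safe)]

X = frozenset({(True, False), (True, True)})
Y = frozenset({(False, True), (True, True)})
print(set(maximal_safe()) == {X, Y})
```

Two incomparable maximal candidates remain, x and y, which is exactly the arbitrary choice described in the text.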

Returning to the compound goal $p_i(x_i), \ldots, p_n(x_n)$, a call described by the
mode $d_i \wedge ((d_i \wedge f_i) \rightarrow e_{i+1})$ is thus sufficient to ensure that neither $p_i(x_i)$
nor the sub-goal $p_{i+1}(x_{i+1}), \ldots, p_n(x_n)$ errors. Since
$d_i \wedge ((d_i \wedge f_i) \rightarrow e_{i+1}) = d_i \wedge (f_i \rightarrow e_{i+1}) = e_i$ it follows that
$p_i(x_i), \ldots, p_n(x_n)$ will not error if its call is described by $e_i$. In
particular, it follows that $e_1$ describes a safe calling mode for the body atoms
of the clause $p(x)$ :- ask($d$), tell($f$), $p_1(x_1), \ldots, p_n(x_n)$.
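The right-to-left recurrence can be written as a short fold over the body atoms. This is a sketch under assumptions: modes are modelled as Python predicates over an environment mapping variable names to truth values, and `e_sequence` and `equivalent` are hypothetical helper names. The data are those of the two-goal example given earlier in this section, with d1 = X and f1 = d2 = f2 = X and Flag.

```python
from itertools import product

def e_sequence(demands, successes):
    """Return [e_1, ..., e_n, e_{n+1}] where e_{n+1} = true and
    e_i = d_i and (f_i -> e_{i+1})."""
    es = [lambda env: True]
    for d, f in reversed(list(zip(demands, successes))):
        nxt = es[0]
        # Default arguments freeze d, f and nxt for this step of the fold.
        es.insert(0, lambda env, d=d, f=f, nxt=nxt:
                  d(env) and ((not f(env)) or nxt(env)))
    return es

def equivalent(f, g, variables):
    """True if f and g agree on every assignment to the given variables."""
    return all(f(env) == g(env)
               for bits in product([False, True], repeat=len(variables))
               for env in [dict(zip(variables, bits))])

x = lambda env: env["X"]
x_and_flag = lambda env: env["X"] and env["Flag"]

e1, e2, e3 = e_sequence([x, x_and_flag], [x_and_flag, x_and_flag])
print(equivalent(e1, x, ["X", "Flag"]))   # e_1 simplifies to X
```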

The next step is to calculate $g = d \wedge (f \rightarrow e_1)$. The abstraction $f$ describes
the grounding behaviour of the Herbrand constraint added to the store prior to
executing the body atoms. Thus $(f \rightarrow e_1)$ describes the weakest mode that, in
conjunction with $f$, ensures that $e_1$ holds, and hence the body atoms are called
safely. Hence $d \wedge (f \rightarrow e_1)$ represents the weakest demand that satisfies both the
body atoms and the assertion $d$. One subtlety which relates to the abstraction
process is that $d$ is required to be a lower-approximation of the assertion whereas
$f$ is required to be an upper-approximation of the constraint. Put another way,
if the mode $d$ describes the binding on the store, then the (concrete) assertion
is satisfied, whereas if the (concrete) constraint is added to the store, then the
store is described by the mode $f$.

Strengthening Lower Approximations. The projection operator $\exists_x$ cannot
be applied to eliminate variables in $g$ that are not present in $p(x)$, since this could
potentially weaken $g$ and thereby compromise safety. Instead a dual projection
operator is applied which is defined by $\forall_x(f) = f'$ if $f' \in Pos$ and otherwise
$\forall_x(f) = \mathit{false}$, where $f' = f[x \mapsto \mathit{false}] \wedge f[x \mapsto \mathit{true}]$. Note that although
$f[x \mapsto \mathit{false}] \vee f[x \mapsto \mathit{true}] \in Pos$ for all $f \in Pos$ it does not follow that
$f[x \mapsto \mathit{false}] \wedge f[x \mapsto \mathit{true}] \in Pos$ for all $f \in Pos$. For example,
$(x \leftarrow y)[x \mapsto \mathit{false}] \wedge (x \leftarrow y)[x \mapsto \mathit{true}] = \neg y$. Like $\exists_x(f)$, $\forall_x(f)$
eliminates a variable $x$ from $f$. The fundamental difference is in the direction of
approximation in that $\forall_x(f) \models f \models \exists_x(f)$. Thus if $Y$ are the variables that are
not present in $p(x)$, then $g' = \forall_Y(g)$ eliminates $Y$ from $g$, where
$\forall_{\{y_1 \ldots y_n\}}(g) = \forall_{y_1}(\ldots \forall_{y_n}(g))$, whilst strengthening $g$. A safe
calling mode for this particular clause is then given by $g'$, since if $g'$ holds then
$g$ holds also.
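The dual projection is easy to prototype. The sketch below assumes functions are represented as Python predicates over a two-variable environment and tests Pos membership by evaluating at the all-true assignment; `forall`, `exists` and `entails` are hypothetical names used only for this illustration.

```python
from itertools import product

VARIABLES = ["x", "y"]

def forall(f, x):
    """f[x->false] and f[x->true] if that lies in Pos, otherwise false."""
    def h(env):
        return f({**env, x: False}) and f({**env, x: True})
    # A function is in Pos when it holds under the all-true assignment.
    return h if h({v: True for v in VARIABLES}) else (lambda env: False)

def exists(f, x):
    """Ordinary projection: f[x->false] or f[x->true]."""
    return lambda env: f({**env, x: False}) or f({**env, x: True})

def entails(f, g):
    """True if every assignment satisfying f also satisfies g."""
    return all((not f(env)) or g(env)
               for bits in product([False, True], repeat=len(VARIABLES))
               for env in [dict(zip(VARIABLES, bits))])

x_if_y = lambda env: (not env["y"]) or env["x"]        # x <- y
x_implies_y = lambda env: (not env["x"]) or env["y"]   # x -> y

for f in (x_if_y, x_implies_y):
    # forall_x(f) entails f, which entails exists_x(f)
    print(entails(forall(f, "x"), f) and entails(f, exists(f, "x")))
```

For x <- y the candidate is the negation of y, which is outside Pos, so the operator falls back to false. For x -> y the result is y.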

$D_{k+1}$ will contain a call pattern $\langle p(x), g'' \rangle$ and, assuming
$g' \wedge g'' \neq g''$, this is updated with $\langle p(x), g' \wedge g'' \rangle$. Thus the call
patterns become progressively stronger on each iteration. Correctness is
preserved because call patterns can be safely approximated from below. The space
of call patterns forms a complete lattice which ensures that a gfp exists. In
fact, because call patterns are approximated from below, the gfp is the most
precise solution, and therefore the desired solution. (This contrasts with the
norm in logic program analysis where approximation is from above and an lfp is
computed.) Moreover, since the space of call patterns is finite, termination is
assured. In fact, the scheme will converge onto the gfp since (lower Kleene)
iteration commences with the top element
$D_0 = \{ \langle p(x), \mathit{true} \rangle \mid p \in \Pi \}$.



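Greatest Fixpoint Calculation for Quicksort. Under this procedure quicksort
generates the following $D_k$ sequence:

$$
\begin{aligned}
D_0 &= \{\, \langle qs(x_1, x_2, x_3),\ \mathit{true} \rangle,\ 
            \langle pt(x_1, x_2, x_3, x_4),\ \mathit{true} \rangle,\ 
            \langle \mathtt{=<'}(x_1, x_2),\ \mathit{true} \rangle,\ 
            \langle \mathtt{>'}(x_1, x_2),\ \mathit{true} \rangle \,\} \\
D_1 &= \{\, \langle qs(x_1, x_2, x_3),\ \mathit{true} \rangle,\ 
            \langle pt(x_1, x_2, x_3, x_4),\ \mathit{true} \rangle,\ 
            \langle \mathtt{=<'}(x_1, x_2),\ x_1 \wedge x_2 \rangle,\ 
            \langle \mathtt{>'}(x_1, x_2),\ x_1 \wedge x_2 \rangle \,\} \\
D_2 &= \{\, \langle qs(x_1, x_2, x_3),\ \mathit{true} \rangle,\ 
            \langle pt(x_1, x_2, x_3, x_4),\ x_2 \wedge (x_1 \vee (x_3 \wedge x_4)) \rangle,\ 
            \langle \mathtt{=<'}(x_1, x_2),\ x_1 \wedge x_2 \rangle,\ 
            \langle \mathtt{>'}(x_1, x_2),\ x_1 \wedge x_2 \rangle \,\} \\
D_3 &= \{\, \langle qs(x_1, x_2, x_3),\ x_1 \rangle,\ 
            \langle pt(x_1, x_2, x_3, x_4),\ x_2 \wedge (x_1 \vee (x_3 \wedge x_4)) \rangle,\ 
            \langle \mathtt{=<'}(x_1, x_2),\ x_1 \wedge x_2 \rangle,\ 
            \langle \mathtt{>'}(x_1, x_2),\ x_1 \wedge x_2 \rangle \,\}
\end{aligned}
$$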


These calculations are non-trivial so consider how $D_2$ is obtained from $D_1$ by
applying the second (abstract) clause of pt as listed in Figure 2, that is, the
clause with head pt(T1, M, T2, H). The following $e_i$ and $g$ formulae are
generated from the demands $d_i$ and the success patterns $f_i$:

$$
\begin{aligned}
e_3 &= \mathit{true} \\
e_2 &= d_2 \wedge (f_2 \rightarrow e_3) \\
    &= \mathit{true} \wedge ((Xs \wedge L \wedge H) \rightarrow \mathit{true}) = \mathit{true} \\
e_1 &= d_1 \wedge (f_1 \rightarrow e_2) \\
    &= (M \wedge X) \wedge ((M \wedge X) \rightarrow \mathit{true}) = M \wedge X \\
g   &= d \wedge (f \rightarrow e_1) \\
    &= \mathit{true} \wedge (((T1 \leftrightarrow X \wedge Xs) \wedge (T2 \leftrightarrow X \wedge L)) \rightarrow (M \wedge X))
\end{aligned}
$$

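To characterise those pt(T1, M, T2, H) calls which are safe, it is necessary
to compute a function $g'$ on the variables T1, M, T2, H which, if satisfied by the
mode of a call, ensures that $g$ is satisfied by the mode of the call. Put another
way, it is necessary to eliminate the variables X, Xs and L from $g$ (those variables
which do not occur in the head pt(T1, M, T2, H)) to obtain a Pos function
$g'$ such that $g$ holds whenever $g'$ holds. This is accomplished by calculating
$g' = \forall_L \forall_{Xs} \forall_X(g)$. First consider the computation of $\forall_X(g)$:

$$
\begin{aligned}
g[X \mapsto \mathit{false}] &= ((T1 \leftrightarrow \mathit{false} \wedge Xs) \wedge (T2 \leftrightarrow \mathit{false} \wedge L)) \rightarrow (M \wedge \mathit{false}) \\
  &= (\neg T1 \wedge \neg T2) \rightarrow \mathit{false} \\
  &= T1 \vee T2 \\
g[X \mapsto \mathit{true}] &= ((T1 \leftrightarrow \mathit{true} \wedge Xs) \wedge (T2 \leftrightarrow \mathit{true} \wedge L)) \rightarrow (M \wedge \mathit{true}) \\
  &= ((T1 \leftrightarrow Xs) \wedge (T2 \leftrightarrow L)) \rightarrow M
\end{aligned}
$$

Since $g[X \mapsto \mathit{false}] \wedge g[X \mapsto \mathit{true}] \in Pos$ it follows that:

$$\forall_X(g) = (((T1 \leftrightarrow Xs) \wedge (T2 \leftrightarrow L)) \rightarrow M) \wedge (T1 \vee T2)$$

(otherwise $\forall_X(g)$ would be set to false). Eliminating the other variables in a
similar way we obtain:

$$
\begin{aligned}
\forall_X(g) &= (((T1 \leftrightarrow Xs) \wedge (T2 \leftrightarrow L)) \rightarrow M) \wedge (T1 \vee T2) \\
\forall_{Xs} \forall_X(g) &= ((T2 \leftrightarrow L) \rightarrow M) \wedge (T1 \vee T2) \\
g' = \forall_L \forall_{Xs} \forall_X(g) &= M \wedge (T1 \vee T2)
\end{aligned}
$$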
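The elimination of X, Xs and L can be double-checked mechanically. The following sketch assumes a truth-table view of Pos functions, encoded as Python predicates over an environment of truth values; `forall` and `models` are hypothetical helper names. It recomputes g' from the formula g given for the second pt clause and compares it with M and (T1 or T2).

```python
from itertools import product

HEAD = ["T1", "M", "T2", "H"]
LOCAL = ["X", "Xs", "L"]

def g(e):
    """g = ((T1 <-> X and Xs) and (T2 <-> X and L)) -> (M and X)."""
    tell = e["T1"] == (e["X"] and e["Xs"]) and e["T2"] == (e["X"] and e["L"])
    return (not tell) or (e["M"] and e["X"])

def forall(f, x):
    """f[x->false] and f[x->true] when that lies in Pos, otherwise false."""
    def h(env):
        return f({**env, x: False}) and f({**env, x: True})
    return h if h({v: True for v in HEAD + LOCAL}) else (lambda env: False)

def models(f):
    """Satisfying assignments of f over the head variables (locals set to true)."""
    fill = {x: True for x in LOCAL}
    return {bits for bits in product([False, True], repeat=len(HEAD))
            if f({**dict(zip(HEAD, bits)), **fill})}

g_prime = g
for x in LOCAL:              # eliminate X, then Xs, then L
    g_prime = forall(g_prime, x)

expected = lambda e: e["M"] and (e["T1"] or e["T2"])
print(models(g_prime) == models(expected))
```

Because each elimination step only strengthens the formula, every assignment that satisfies the result also satisfies the original g.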

Observe that if $\forall_L \forall_{Xs} \forall_X(g)$ holds then $g$ holds. Thus if the mode of a call
satisfies $g'$ then the mode also satisfies $g$ as required. This clause thus yields
the call pattern $\langle pt(x_1, x_2, x_3, x_4),\ x_2 \wedge (x_1 \vee x_3) \rangle$. Similarly the first
and third clauses contribute the patterns $\langle pt(x_1, x_2, x_3, x_4),\ \mathit{true} \rangle$ and
$\langle pt(x_1, x_2, x_3, x_4),\ x_2 \wedge (x_1 \vee x_4) \rangle$. Observe also that

$$\mathit{true} \wedge (x_2 \wedge (x_1 \vee x_3)) \wedge (x_2 \wedge (x_1 \vee x_4)) = x_2 \wedge (x_1 \vee (x_3 \wedge x_4))$$

which gives the final call pattern formula for $pt(x_1, x_2, x_3, x_4)$ in $D_2$. The gfp
is reached at $D_3$ since $D_4 = D_3$. The gfp often expresses elaborate calling
modes; for example, it states that $pt(x_1, x_2, x_3, x_4)$ cannot generate an
instantiation error (nor can any predicate that it calls) if it is called with its
second, third and fourth arguments ground. This is a surprising result which
suggests that the analysis can infer information that might normally be missed
by a programmer.

2.4 Work Related to Mode Inference 

Mode inference was partly motivated by the revival of interest in logic
programming with assertions [4,56,64]. Interestingly, [56] observe that predicates
are normally written with an expectation on the initial calling pattern, and hence
provide an entry assertion to make, say, the moding of the top-level queries
explicit. Mode inference gives a way of automatically synthesising entry
assertions, providing a provably correct way of ensuring that instantiation errors
do not occur during program execution.

An analysis for type inference could be constructed by refining the analysis 
presented in this section by replacing the mode domain Pos with the domain of 
directional types [1,30,40]. This domain is condensing and therefore the domain 
comes equipped with the relative pseudo-complement operator that is necessary 
for backward reasoning. Interestingly, type inference can be performed even when 
the domain is not relatively pseudo-complemented [44]. This, however, relies on 
a radically different form of fixpoint calculation and therefore this approach to 
type inference is discussed separately in section 5. 

3 Backwards Termination Inference 

The aim of termination inference is to determine conditions under which calls to
a predicate are guaranteed to terminate [24]. Termination inference is not a new
idea in itself; it dates back to the pioneering work of Mesnard and his colleagues
[34,50,52,53]. Recently it has been observed, however, that termination inference 
[24] can be performed by composing backwards analysis with a standard termi- 
nation checker [10]. The elegance of this approach is that termination analysis 
can be reversed without dismantling an existing (forwards) termination analysis. 

The key advantage of (backwards) termination inference over (forwards)
termination checking is that termination inference can deduce, in a single
application, a class of queries that lead to finite LD-derivations [24]. To
illustrate this key idea, consider the program split listed in Figure 4. The split
predicate arises in the classic mergesort algorithm where it is used to partition a
list into sub-lists as preparation for an ordered merge [43]. For instance, the
goal split([a,b,c], L1, L2) will terminate, binding L1 and L2 to [a,c] and
[b] respectively. Correspondingly, a termination checker will ascertain that the
call split(L, L1, L2) will terminate if L is bound to a list of fixed length
[43]. However, a termination inference engine such as TerminWeb [24] or cTI
[50] will deduce that split(L, L1, L2) terminates with either the first argument
bound to a closed list or both the second and third arguments bound to
closed lists. Of course, a termination checker could be reapplied to prove that
split(L, L1, L2) will terminate under the latter condition, but inference finds
all the termination conditions in one application. Thus termination inference can
discover termination conditions that are not observed by one, or possibly many,
applications of a termination checker. Note that this is not due to a failing of
the checker; it is due to the programmer failing to realise that a condition
warrants checking. (Actually, the conditions under which termination inference
truly generalises termination checking are technical [24] and relate, among other
things, to properties of the projection operators $\exists_x$ and $\forall_x$ [40].)

    split([], [], []) :- true.
    split([X | Xs], [X | L1], L2) :- split(Xs, L2, L1).

Fig. 4. split program expressed in Prolog

Termination inference can be realised in terms of the backwards analysis 
framework of [39] that was applied, in the previous section, to the problem of 
mode inference. In fact the only conceptual difference between mode inference 
and termination inference is in the way in which assertions are calculated. Whilst 
for mode analysis assertions are direct groundness abstractions of the builtins, 
for termination inference, assertions need to be calculated by an analysis of the 
loops within the program. 

Termination analyses typically amount to showing that successive goals in 
an LD-derivation are decreasing with respect to some well-founded ordering. In 
the context of a termination checker founded on a binary clause semantics [20], 
this reduces to observing a size decrease between the arguments of the head 
and the body atom for each recursive binary clause [10]. From such a checker, a 
termination inference engine is obtained as follows: 

— Firstly, the program is abstracted with respect to a chosen norm (or possibly
a series of norms [25]). A norm maps each Herbrand term in the program
to a linear expression that represents its size. Syntactic equations between
Herbrand terms are replaced with linear inequations which express size re- 
lationships. The resulting program is abstract - it is a constraint program 
over the domain of linear constraints - but it is not binary; abstract clauses 
may contain more than one body atom. 

— Secondly, an abstract version of the binary clause semantics is applied [10]. 
The (concrete) binary clause semantics of [20] provides a sound basis for 
termination analysis since the set of (concrete) binary clauses it defines pre- 
cisely characterises the looping behaviour of the program. Specifically, a call 
to a given predicate will left-terminate if each corresponding recursive binary 
clause possesses a body atom that is strictly smaller than its head. Since this 
set of clauses is not finitely computable, an abstract version of binary clause 
semantics is used to compute a set of abstract binary clauses which, though 
finite, faithfully describes the set of concrete binary clauses. The linear
inequalities in these abstract clauses capture size relationships between the
arguments in the head and the body atom of the concrete clauses.

— Thirdly, combinations of ground arguments that are sufficient for termination
are derived. The crucial point is that a decrease in size is only observable
if sufficient arguments of a call are ground. These ground argument
combinations are extracted from the linear inequalities in the abstract binary
clauses, expressed as Boolean functions, and added to the original program in the
form of assertions.

— Fourthly and finally, backwards mode analysis is performed on the program
augmented with its assertions. The greatest fixpoint then yields groundness
conditions which, if satisfied by an initial call, ensure that the call leads to
a finite LD-derivation.

Prolog program:

    subset([], _) :- true.
    subset([X | Xs], Ys) :- member(X, Ys), subset(Xs, Ys).
    member(X, [X | Xs]) :- true.
    member(X, [_ | Ys]) :- member(X, Ys).

List-length abstraction:

    subset(0, Ys) :- 0 ≤ Ys, true.
    subset(1 + Xs, Ys) :- 0 ≤ X, 0 ≤ Xs, 0 ≤ Ys, member(X, Ys), subset(Xs, Ys).
    member(X, 1 + Xs) :- 0 ≤ X, 0 ≤ Xs, true.
    member(X, 1 + Ys) :- 0 ≤ X, 0 ≤ Ys, member(X, Ys).

Fig. 5. subset program expressed in Prolog, and its list-length abstraction

Note that backwards termination inference can be considered to be the
composition of three black-boxes: one that infers binary clauses, another which
extracts assertions from the binary clauses, and yet another which performs mode
inference. Because of this construction, readers who wish to skip the details on
approximating loops can progress directly to section 3.3.

3.1 Program Abstraction 

Termination inference will be illustrated using the subset program listed in
Figure 5. The predicate subset(L1, L2) holds iff each element of
the list L1 occurs within the list L2. Observe that neither subset(L1, [a,b,c])
nor the call subset([a,b,c], L2) terminates when L1 and L2 are uninstantiated.
However, both calls will terminate (albeit possibly in failure) when L1 and L2
are ground. The challenge is to automatically derive these grounding properties
which are sufficient to guarantee termination.

Non-termination of logic programs is the result of infinite loops occurring
during execution. Consequently recursive calls are the focus of termination
analysis; a logic program will terminate if the arguments of successive calls to a
predicate become progressively smaller with respect to a well-founded ordering.
Thus, the notion of argument size (and more generally term size) is at the core
of termination analyses. To measure term size, a norm is applied which maps
a ground term to a natural number. To support program abstraction [28], the
concept is normally lifted to terms that contain variables by defining a symbolic
norm which maps a term to an expression over variables, non-negative integer
constants and the functor +. For instance, the list-length norm is defined over
the set of ground terms by:

$$
|t|_{length} =
\begin{cases}
|t_2|_{length} + 1 & \text{if } t = [t_1 | t_2] \\
0 & \text{otherwise}
\end{cases}
$$

whereas the symbolic list-length norm is given by:

$$
|t|_{length} =
\begin{cases}
|t_2|_{length} + 1 & \text{if } t = [t_1 | t_2] \\
t & \text{if } t \text{ is a variable} \\
0 & \text{otherwise}
\end{cases}
$$

This symbolic norm describes the length of a list, using a variable to describe the
variable length of an open list. For example
$|[X | Xs]|_{length} = 1 + |Xs|_{length} = 1 + Xs$. Non-list terms are ascribed a length
of zero. The second part of Figure 5 gives the list-length norm abstraction of
subset and member, which is obtained by replacing each term with its size.
Since a norm can only map a variable to a non-negative value, extra
inequalities are introduced to ensure that all (size) variables are non-negative.
Observe that the resulting abstraction is a constraint program over the system
of linear inequations.
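The symbolic list-length norm can be sketched in a few lines. The term representation below (`Var`, `Cons`, and a string for any other term) is an assumption made purely for illustration, and the result is returned as a pair of a constant and an optional variable name rather than as a symbolic expression.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    """A logic variable, identified by name."""
    name: str

@dataclass(frozen=True)
class Cons:
    """The list constructor [head | tail]."""
    head: object
    tail: object

NIL = "[]"   # like any other non-list, non-variable term, it has length 0

def list_length(term):
    """Symbolic list-length norm as (constant, variable name or None).

    (1, "Xs") stands for the expression 1 + Xs; (2, None) stands for 2.
    """
    count = 0
    while isinstance(term, Cons):      # |[t1|t2]| = 1 + |t2|
        count += 1
        term = term.tail
    if isinstance(term, Var):          # |t| = t if t is a variable
        return (count, term.name)
    return (count, None)               # |t| = 0 otherwise

print(list_length(Cons(Var("X"), Var("Xs"))))   # the term [X | Xs]
```

An open list such as [X | Xs] yields a size that still mentions Xs, which is what lets the abstraction relate the sizes of head and body arguments.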



3.2 Least Fixpoint Calculation over Binary Clauses 

In [10] it is shown (using a semantics for call patterns similar to that of [20])
that a logic program is terminating iff its binary unfolding is. Informally, the
binary unfolding of a program is the least set of binary clauses each with a
head and body such that the head occurs as a head in the original program
and, when the original program is called with the head as a goal, then the body
occurs as a subsequent sub-goal in an LD-derivation. The binary unfolding is
formally expressed in terms of the lfp of a $T_P$-style operator [10,20]. Moreover,
an abstract binary unfolding can be obtained by applying an abstraction of the
binary unfolding operator to the abstract program [10].

Calculation of this abstract lfp is complicated by the property that the
domain of linear inequations does not satisfy the ascending chain condition [15].
This property compromises the termination of the abstract lfp calculation since
it enables a set of abstract binary clauses to be repeatedly enlarged on successive
iterates ad infinitum. Termination can be assured, however, by restricting the
inequations that occur within abstract binary clauses to a finite sub-class, for
example, the sub-class of monotonicity and equality constraints [5]. Alternatively
widening can be applied to enforce convergence [3]. Using the latter technique
the following abstract binary unfolding is obtained:

$$
\begin{array}{l}
member(x_1, x_2) \;\mathtt{:-}\; 0 \leq x_1 \wedge 1 \leq x_2,\ \mathit{true} \\
member(x_1, x_2) \;\mathtt{:-}\; 0 \leq x_1 \wedge 0 \leq y_2 \wedge 1 + y_2 \leq x_2 \wedge x_1 = y_1,\ member(y_1, y_2) \\
subset(x_1, x_2) \;\mathtt{:-}\; 0 \leq x_1 \wedge 0 \leq x_2,\ \mathit{true} \\
subset(x_1, x_2) \;\mathtt{:-}\; y_2 \leq x_2 \wedge 1 \leq x_1 \wedge 0 \leq y_2 \wedge 0 \leq y_1,\ member(y_1, y_2) \\
subset(x_1, x_2) \;\mathtt{:-}\; y_1 + 1 \leq x_1 \wedge y_2 = x_2 \wedge 1 \leq x_1 \wedge 1 \leq x_2,\ subset(y_1, y_2)
\end{array}
$$

The set of abstract clauses contains at most $|\Pi|^2$ clauses where $\Pi$ is the set of
predicate symbols occurring in the program (which is assumed to include true).
This follows because widening ensures that two abstract clauses cannot share
the same predicate symbols in both the head and body. Note that if an abstract
binary clause has true as its body atom then the clause does not describe a
loop and therefore has no bearing on the termination behaviour. Such clauses
are given above simply for completeness.



3.3 Extracting the Assertions from the Binary Clauses 

Those abstract binary clauses that involve recursive calls are as follows:

$$
\begin{array}{l}
member(x_1, x_2) \;\mathtt{:-}\; 0 \leq x_1 \wedge 0 \leq y_2 \wedge 1 + y_2 \leq x_2 \wedge x_1 = y_1,\ member(y_1, y_2) \\
subset(x_1, x_2) \;\mathtt{:-}\; y_1 + 1 \leq x_1 \wedge y_2 = x_2 \wedge 1 \leq x_1 \wedge 1 \leq x_2,\ subset(y_1, y_2)
\end{array}
$$

Consider the abstract clause for member. The inequality $1 + y_2 \leq x_2$ asserts
that the recursive call is smaller than the previous call (as measured by the
list-length norm). Therefore, assuming that the second argument of the original
call to member is ground, each recursive call will operate on a strictly smaller
list and thus terminate. Hence, although one abstract member clause approximates
many concrete member clauses, the approximation is sufficiently precise to enable
termination properties to be deduced. Likewise, the inequality $y_1 + 1 \leq x_1$ for
subset ensures that termination follows if the first argument of the initial call
to subset is ground.
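As a rough sanity check, the argument positions that decrease in each recursive abstract clause can be found by enumerating small sizes. This is only a bounded sketch, not a proof and not how an analyser would do it (a real implementation would reason over the linear constraints symbolically); the inequalities are read here as non-strict, and the function names are hypothetical.

```python
from itertools import product

def member_clause(x1, x2, y1, y2):
    """Constraint of the recursive abstract binary clause for member."""
    return 0 <= x1 and 0 <= y2 and 1 + y2 <= x2 and x1 == y1

def subset_clause(x1, x2, y1, y2):
    """Constraint of the recursive abstract binary clause for subset."""
    return y1 + 1 <= x1 and y2 == x2 and 1 <= x1 and 1 <= x2

def decreasing_arguments(constraint, bound=6):
    """Positions i with y_i < x_i in every solution with sizes below bound."""
    solutions = [s for s in product(range(bound), repeat=4) if constraint(*s)]
    return [i + 1 for i in range(2)
            if solutions and all(s[2 + i] < s[i] for s in solutions)]

print(decreasing_arguments(member_clause))   # second argument of member
print(decreasing_arguments(subset_clause))   # first argument of subset
```

The positions found correspond to the arguments whose groundness is required by the assertions: the second argument of member and the first argument of subset.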

Since termination is dependent on groundness, the inequalities in recursive
abstract clauses induce groundness requirements that, if satisfied, assure
termination. Since the number of ground argument combinations is exponential in
the number of arguments, inferring the optimal set of ground argument
combinations is potentially expensive (though experimentation suggests the
contrary [51]). Therefore only a subset of the argument combinations may be
considered [24]. Once extracted, the requirements are added to the original logic
program in the form of assertions. Figure 6 lists the subset program complete
with assertions that are sufficient for termination.

Normalised program:

    subset(A, B) :- tell(A = []), true.
    subset(A, B) :- tell(A = [X | Xs]), member(X, B), subset(Xs, B).
    member(X, B) :- tell(B = [X | Xs]), true.
    member(A, B) :- tell(B = [Y | Ys]), member(A, Ys).

Pos abstraction with assertions:

    subset(A, B) :- ask(A), tell(A), true.
    subset(A, B) :- ask(A), tell(A ↔ (X ∧ Xs)), member(X, B), subset(Xs, B).
    member(X, B) :- ask(B), tell(B ↔ (X ∧ Xs)), true.
    member(A, B) :- ask(B), tell(B ↔ (Y ∧ Ys)), member(A, Ys).

Fig. 6. subset normalised and as a Pos abstraction with assertions



3.4 Backwards Mode Analysis 

Backwards mode analysis can then be performed on the program with its
assertions as specified in the previous section. Using the notation from that
section, backwards mode analysis yields the following sequence of iterates:

$$
\begin{aligned}
D_0 &= \{\, \langle member(x_1, x_2),\ \mathit{true} \rangle,\ \langle subset(x_1, x_2),\ \mathit{true} \rangle \,\} \\
D_1 &= \{\, \langle member(x_1, x_2),\ x_2 \rangle,\ \langle subset(x_1, x_2),\ x_1 \rangle \,\} \\
D_2 &= \{\, \langle member(x_1, x_2),\ x_2 \rangle,\ \langle subset(x_1, x_2),\ x_1 \wedge x_2 \rangle \,\}
\end{aligned}
$$

The fixpoint is reached and checked in the next iteration since $D_3 = D_2$. The
fixpoint specifies grounding conditions that are sufficient for termination. That
is, subset is guaranteed to left-terminate if both of its arguments are ground.
This is as expected, since the first argument of subset needs to be ground
in order that its own recursive call terminates. Moreover, the second argument
additionally needs to be ground in order that the call to member terminates. Both
the recursive call and the call to member are required to terminate to assure that
a call to the second clause of subset terminates.
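The iterates can be reproduced with a small brute-force prototype of the backwards step. This is a sketch under stated assumptions rather than the authors' implementation: Pos formulae are encoded as Python predicates, call patterns as sets of satisfying argument tuples, and the success patterns in `SUCCESS` were worked out by hand for this illustration (they are not quoted from the text). All function names are hypothetical.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# Assumed success patterns (derived by hand): x2 -> x1 for both predicates.
SUCCESS = {
    "member": lambda args: implies(args[1], args[0]),
    "subset": lambda args: implies(args[1], args[0]),
}

# (predicate, head variables, local variables, ask d, tell f, body atoms)
CLAUSES = [
    ("subset", ("A", "B"), (), lambda e: e["A"], lambda e: e["A"], []),
    ("subset", ("A", "B"), ("X", "Xs"), lambda e: e["A"],
     lambda e: e["A"] == (e["X"] and e["Xs"]),
     [("member", ("X", "B")), ("subset", ("Xs", "B"))]),
    ("member", ("X", "B"), ("Xs",), lambda e: e["B"],
     lambda e: e["B"] == (e["X"] and e["Xs"]), []),
    ("member", ("A", "B"), ("Y", "Ys"), lambda e: e["B"],
     lambda e: e["B"] == (e["Y"] and e["Ys"]),
     [("member", ("A", "Ys"))]),
]

def forall(g, x, variables):
    """Dual projection: g[x->false] and g[x->true] if in Pos, else false."""
    def h(env):
        return g({**env, x: False}) and g({**env, x: True})
    return h if h({v: True for v in variables}) else (lambda env: False)

def clause_demand(clause, patterns):
    """Models, over the head variables, of the safe mode g' for one clause."""
    _, head, local, d, f, body = clause
    variables = head + local

    def e(env, i=0):
        # e_i = d_i and (f_i -> e_{i+1}), with e_{n+1} = true
        if i == len(body):
            return True
        pred, args = body[i]
        call = tuple(env[v] for v in args)
        return call in patterns[pred] and implies(SUCCESS[pred](call),
                                                  e(env, i + 1))

    g = lambda env: d(env) and implies(f(env), e(env))
    for x in local:
        g = forall(g, x, variables)
    fill = {x: True for x in local}   # g no longer depends on the locals
    return {bits for bits in product([False, True], repeat=len(head))
            if g({**dict(zip(head, bits)), **fill})}

def iterates():
    """Downward iteration from the top element; returns [D_0, D_1, ...]."""
    current = {p: set(product([False, True], repeat=2)) for p in SUCCESS}
    history = [current]
    while True:
        revised = {p: set(models) for p, models in current.items()}
        for clause in CLAUSES:
            revised[clause[0]] &= clause_demand(clause, current)
        if revised == current:
            return history
        history.append(revised)
        current = revised

D = iterates()
print(len(D))
```

Each element of `D` maps a predicate to the set of (x1, x2) truth assignments satisfying its call pattern, so x1 and x2 for subset appears as the single tuple (True, True).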

3.5 Work Related to Termination Inference 

Performing termination inference via backwards analysis is a comparatively
new idea [24] but termination inference was developed by Mesnard and others
[34,50,51,52,53] long before this connection was made. Their system, the cTI
analyser, applied a μ-calculus solver to compute a greatest fixpoint. This seems
to suggest that greatest fixpoints are intrinsic to the problem itself. On the other
hand, the termination inference analyser reported in [24] (and described in this
section) is composed from two components: a standard termination checker [10]
and a backwards analysis. The resulting analyser is similar to cTI; the main
difference is its design as two existing black-box components which, according
to [24], simplifies the formal justification and implementation.



4 Backwards Suspension Inference 

In mode inference [39] the assertions are synthesised from the builtins. In
termination inference [24] the assertions are distilled from a separate analysis of
the loops which occur within the program [10]. Both these analyses share the
same backwards analysis component, which essentially propagates requirements
right-to-left over sequences of goals against the control-flow. Interestingly, and
perhaps surprisingly, backwards analysis can still be applied when
the control is more loosely defined. In fact, backwards analysis is still applicable
even when the control is specified by a delay mechanism which blocks the
selection of a sub-goal until some condition is satisfied [7]. This, arguably, is one
of the most flexible ways of specifying control within logic programming.

Delays have proved to be invaluable for handling negation [54], delaying
non-linear constraints [32], enforcing termination [47], improving search and
modelling concurrency [45]. However, reasoning about logic programs with delays
is notoriously difficult and one recurring problem for the programmer is that
of determining whether a given program and goal can reduce to a state which
possesses a sub-goal that suspends indefinitely. A number of abstract
interpretation schemes [8,11,17] have therefore been proposed for verifying that a program
and goal cannot suspend in this fashion. These analyses are essentially forwards 
in that they simulate the operational semantics tracing the execution of the 
program in the direction of the control with collections of abstract states. This 
section reviews a suspension analysis that is performed backwards by propa- 
gating requirements against the control-flow. Specifically, rather than verifying 
that a particular goal will not lead to a suspension, the analysis infers a class 
of goals that will not lead to suspension. This approach has the computational 
advantage that the programmer need not rerun the analysis for different (ab- 
stract) queries. Moreover, like the previous analyses, this suspension analysis is 
formulated as two simple bottom-up fixpoint computations. The analysis strikes 
a good balance between tractability and precision. It avoids the complexity of 
goal interleaving by exploiting reordering properties of monotonic and positive 
Boolean functions. 

Another noteworthy aspect of the analysis is that it verifies whether a logic
program with delays can be scheduled with a local selection rule [63]. Under
local selection, the selected atom is completely resolved, that is, those atoms
it directly and indirectly introduces are also resolved, before any other atom is
selected. Leftmost selection is one example of local selection. Knowledge about
suspension within the context of local selection is useful in its own right
[17,41]. In particular, [17] explains how various low-level optimisations, such as
returning output values in registers, can be applied if goals can be scheduled
left-to-right without suspension. Furthermore, any program that can be shown
to be suspension-free under local selection is clearly suspension-free with a more
general selection rule. Note, however, that the converse does not follow and
the analysis cannot infer non-suspension if the program relies on coroutining
techniques.

    inorder(nil, []) :- true.
    inorder(tree(L, V, R), I) :-
        append(LI, [V|RI], I), inorder(L, LI), inorder(R, RI).

    :- block append(-, ?, -).
    append([], X, X) :- true.
    append([X | Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).

Fig. 7. inorder program expressed in Prolog with block declarations

4.1 Worked Example on Suspension Inference 

To illustrate the ideas behind suspension analysis, consider an analysis of the
Prolog program listed in Figure 7. Declaratively, the program defines the
relation that the second argument (a list) is an in-order traversal of the first
argument (a tree). Operationally, the declaration block append(-, ?, -)
delays (blocks) append goals until their arguments are sufficiently instantiated.
The dashes in the first and third argument positions specify that a call to append
is to be delayed until either its first or third argument is bound to a non-variable
term. Thus append goals can be executed in one of two modes. The problem
is to compute input modes which are sufficient to guarantee that any inorder
query which satisfies the modes will not lead to a suspension under local
selection. This problem can be solved with backwards analysis. Backwards analysis
infers requirements on the input which ensure that certain properties hold at
(later) program points [39]. Exactly like before, the analysis is tackled via an
abstraction step followed by a least fixpoint (lfp) and then a greatest fixpoint
(gfp) computation.

4.2 Program Abstraction 

Abstraction reduces to two transformations: one from a Prolog with delay
program to a concurrent constraint programming (ccp) program [60] and another
from the ccp program to a Pos abstraction. The Prolog program is re-written
to a ccp program to make blocking requirements explicit in the program as ask
constraints. More exactly, a clause of a ccp program takes the form
h :- ask(c'), tell(c''), g where h is an atom, g is a conjunction of body atoms and
c' and c'' are the ask and tell constraints. The asks are guards that inspect the
store and specify synchronisation behaviour whereas the tells are writes that
update the store. As before, empty conjunctions of atoms are denoted by true.
Unlike before, ask does not denote an assertion but a synchronisation
requirement. Moreover, a conjunction of goals g is not necessarily executed
left-to-right: goals can only be reduced with a clause when the ask constraint
within the clause is satisfied. A goal will suspend until this is the case, hence the
execution order of the sub-goals within a goal does not necessarily concur with
the textual (left-to-right) ordering of these sub-goals. In this particular example,
the only ask constraint that appears in the program is nonvar(x) which
formalises the requirement that x must be bound to a non-variable term.

ccp program:

    inorder(T, I) :-
        ask(true),
        tell(T = nil, I = []),
        true.
    inorder(T, I) :-
        ask(true),
        tell(T = tree(L,V,R), A = [V|RI]),
        append(LI, A, I),
        inorder(L, LI),
        inorder(R, RI).
    append(L, Ys, A) :-
        ask(nonvar(L) ∨ nonvar(A)),
        tell(L = [], A = Ys),
        true.
    append(L, Ys, A) :-
        ask(nonvar(L) ∨ nonvar(A)),
        tell(L = [X|Xs], A = [X|Zs]),
        append(Xs, Ys, Zs).

Pos abstraction:

    inorder(T, I) :-
        ask(true),
        tell(T ∧ I),
        true.
    inorder(T, I) :-
        ask(true),
        tell(T ↔ (L ∧ V ∧ R), A ↔ (V ∧ RI)),
        append(LI, A, I),
        inorder(L, LI),
        inorder(R, RI).
    append(L, Ys, A) :-
        ask(L ∨ A),
        tell(L ∧ (A ↔ Ys)),
        true.
    append(L, Ys, A) :-
        ask(L ∨ A),
        tell(L ↔ (X ∧ Xs), A ↔ (X ∧ Zs)),
        append(Xs, Ys, Zs).

Fig. 8. inorder program expressed in ccp and as a Pos abstraction

The second transform abstracts the ask and tell constraints with Boolean
functions which capture instantiation dependencies. The ask constraints are
abstracted from below whereas the tell constraints are abstracted from above.
More exactly, an ask abstraction is stronger than the ask constraint: whenever
the abstraction holds then the ask constraint is satisfied; whereas the tell
abstraction is weaker than the tell constraint: whenever the tell constraint holds
then so does its abstraction. For example, the function $L \vee A$ describes states
where either L or A is ground [2] which, in turn, ensures that the ask constraint
nonvar(L) ∨ nonvar(A) holds. On the other hand, once the tell A = [V|RI]
holds, then the grounding behaviour of the state (and all subsequent states) is
described by $A \leftrightarrow (V \wedge RI)$.

4.3 Least Fixpoint Calculation

The least fixpoint calculation approximates the success patterns of the ccp
program (and thus the Prolog with delays program) by mimicking the $T_P$ operator
[28]. A success pattern is an atom with distinct variables for arguments paired
with a Pos formula over those variables. This is the same notion of success
pattern as used in mode inference and, just as in mode inference, a success pattern
summarises the behaviour of an atom by describing the bindings it can make.
The lfp of the Pos program can be computed in a finite number of iterates to
give the following lfp:

$$
\{\, \langle inorder(x_1, x_2),\ x_1 \leftrightarrow x_2 \rangle,\ 
     \langle append(x_1, x_2, x_3),\ (x_1 \wedge x_2) \leftrightarrow x_3 \rangle \,\}
$$
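The lfp can be recomputed by a naive bottom-up iteration over truth tables. The sketch below assumes the Pos abstraction of Figure 8 with the ask constraints ignored (only tells and body atoms affect success patterns); the clause encoding and the function names are hypothetical and chosen for brevity, not efficiency.

```python
from itertools import product

# (predicate, head variables, local variables, tell, body atoms)
CLAUSES = [
    ("inorder", ("T", "I"), (), lambda e: e["T"] and e["I"], []),
    ("inorder", ("T", "I"), ("L", "V", "R", "A", "LI", "RI"),
     lambda e: e["T"] == (e["L"] and e["V"] and e["R"])
               and e["A"] == (e["V"] and e["RI"]),
     [("append", ("LI", "A", "I")),
      ("inorder", ("L", "LI")),
      ("inorder", ("R", "RI"))]),
    ("append", ("L", "Ys", "A"), (),
     lambda e: e["L"] and e["A"] == e["Ys"], []),
    ("append", ("L", "Ys", "A"), ("X", "Xs", "Zs"),
     lambda e: e["L"] == (e["X"] and e["Xs"])
               and e["A"] == (e["X"] and e["Zs"]),
     [("append", ("Xs", "Ys", "Zs"))]),
]

def step(success):
    """One application of a T_P-style operator on sets of argument tuples."""
    new = {p: set(models) for p, models in success.items()}
    for pred, head, local, tell, body in CLAUSES:
        variables = head + local
        for bits in product([False, True], repeat=len(variables)):
            env = dict(zip(variables, bits))
            if tell(env) and all(tuple(env[v] for v in args) in success[q]
                                 for q, args in body):
                new[pred].add(tuple(env[v] for v in head))
    return new

def lfp():
    """Iterate upwards from the empty success patterns until stable."""
    success = {"inorder": set(), "append": set()}
    while True:
        nxt = step(success)
        if nxt == success:
            return success
        success = nxt

result = lfp()
print(sorted(result["inorder"]))   # models of x1 <-> x2
```

The set computed for append contains exactly the tuples whose third component equals the conjunction of the first two, matching the formula (x1 and x2) <-> x3.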



4.4 Greatest Fixpoint Calculation 

A gfp is computed to characterise the safe call patterns of the program. A call 
pattern has the same form as a success pattern. Iteration commences with 

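$$D_0 = \left\{ \begin{array}{l} \langle \text{inorder}(x_1, x_2),\ \mathit{true} \rangle, \\ \langle \text{append}(x_1, x_2, x_3),\ \mathit{true} \rangle \end{array} \right\}$$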

and incrementally strengthens the call pattern formulae until they are safe, that
is, they describe queries which are guaranteed not to violate the ask constraints.
The iterate $D_{i+1}$ is computed by putting $D_{i+1} = D_i$ and then revising $D_{i+1}$ by
considering each $p(x)$ :- ask($d$), tell($f$), $p_1(x_1), \ldots, p_n(x_n)$ in the abstract
program and calculating a (monotonic) formula that describes input modes (if any)
under which the atoms in the clause can be scheduled without suspension under
local selection. A monotonic formula over a set of variables $X$ is any formula of
the form $\bigvee_{i=1}^{n} (\bigwedge Y_i)$ where $Y_i \subseteq X$ [18]. Let $d_i$ denote a monotonic formula that
describes the call pattern requirement for $p_i(x_i)$ in $D_i$ and let $f_i$ denote the
success pattern formula for $p_i(x_i)$ in the lfp (that is not necessarily monotonic).
A new call pattern for $p(x)$ is computed using the following algorithm:

- Calculate $e = \bigwedge_{i=1}^{n} (d_i \rightarrow f_i)$ that describes the grounding behaviour of
the compound goal $p_1(x_1), \ldots, p_n(x_n)$. The intuition is that $p_i(x_i)$ can be
described by $d_i \rightarrow f_i$ since if the input requirements $d_i$ hold then $p_i(x_i)$ can
be executed without suspension, hence the output $f_i$ must also hold.

- Compute $e' = \bigwedge_{i=1}^{n} d_i$ which describes a groundness property sufficient for
scheduling all of the goals in the compound goal without suspension. Then
$e \rightarrow e'$ describes a grounding property which, if satisfied when the compound
goal is called, ensures the goal can be scheduled by local selection without
suspension.

- Calculate $g = d \wedge (f \rightarrow (e \rightarrow e'))$ that describes a grounding property which
is strong enough to ensure that both the ask is satisfied and the body atoms
can be scheduled by local selection without suspension.




Analysing Logic Programs by Reasoning Backwards 175 



- Eliminate those variables not present in $p(x)$, $Y$ say, by calculating
$g' = \forall_Y(g)$ where $\forall_{\{y_1 \ldots y_n\}}(g) = \forall_{y_1}(\ldots \forall_{y_n}(g))$. Hence $\forall_x(f)$ entails $f$ and
$g'$ entails $g$, so that a safe calling mode for this particular clause is then given
by $g'$.

- Compute a monotonic function $g''$ that entails $g'$. Since $g''$ is stronger than
$g'$ it follows that $g''$ is sufficient for scheduling the compound goal by local
selection without suspension. The function $g'$ needs to be approximated by
a monotonic function since the $e \rightarrow e'$ step relies on $d_i$ being monotonic.

- Replace the pattern $\langle p(x), g''' \rangle$ in $D_{i+1}$ with $\langle p(x), g'' \wedge g''' \rangle$.
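
The steps of this algorithm can be traced on the append predicate. The sketch below is not the authors' implementation. It assumes the standard two-clause append, an ask abstraction $x_1 \vee x_3$ for both clauses (first or third argument ground), the success pattern $(x_1 \wedge x_2) \leftrightarrow x_3$, and a representation of Boolean functions as Python predicates over truth assignments. All helper names are hypothetical.

```python
from itertools import product

HEAD = ["x1", "x2", "x3"]

def envs(variables):
    """Enumerate every truth assignment over the given variables."""
    for bits in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, bits))

def table(f, variables=HEAD):
    """Truth table of f, used to compare Boolean functions for equality."""
    return tuple(f(env) for env in envs(variables))

def forall(f, local_vars):
    """Universally quantify local_vars out of f."""
    return lambda env: all(f({**env, **loc}) for loc in envs(local_vars))

def monotonic_below(f, variables=HEAD):
    """Weakest monotonic function that entails f: it holds in a state only if
    f holds there and in every state that grounds additional variables."""
    return lambda env: all(
        f(e) for e in envs(variables)
        if all(e[v] for v in variables if env[v]))

def clause_call_pattern(d, f, body, local_vars):
    """Apply the steps to one clause; body is a list of (d_i, f_i) pairs."""
    e = lambda env: all((not di(env)) or fi(env) for di, fi in body)
    e1 = lambda env: all(di(env) for di, _ in body)
    g = lambda env: d(env) and ((not f(env)) or (not e(env)) or e1(env))
    return monotonic_below(forall(g, local_vars))

ask = lambda env: env["x1"] or env["x3"]   # assumed abstraction of the delay

def refine(pattern):
    """Compute the next call pattern for append/3 from the current one."""
    # Clause 1: append(x1, x2, x3) :- tell(x1 = [], x2 = x3).
    tell1 = lambda env: env["x1"] and (env["x2"] == env["x3"])
    g1 = clause_call_pattern(ask, tell1, [], [])
    # Clause 2: append(x1, x2, x3) :- tell(x1 = [X|Xs], x3 = [X|Zs]),
    #                                  append(Xs, x2, Zs).
    tell2 = lambda env: (env["x1"] == (env["X"] and env["Xs"])
                         and env["x3"] == (env["X"] and env["Zs"]))
    rename = lambda env: {"x1": env["Xs"], "x2": env["x2"], "x3": env["Zs"]}
    d_body = lambda env: pattern(rename(env))
    f_body = lambda env: (env["Xs"] and env["x2"]) == env["Zs"]
    g2 = clause_call_pattern(ask, tell2, [(d_body, f_body)], ["X", "Xs", "Zs"])
    return lambda env: pattern(env) and g1(env) and g2(env)

d0 = lambda env: True      # initial call pattern: true
d1 = refine(d0)
d2 = refine(d1)
```

With these assumptions the first refinement strengthens the call pattern of append from true to $x_1 \vee x_3$ and the second leaves it unchanged, so the pattern is stable for this predicate.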

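This procedure generates the following $D_i$ sequence:

$$D_1 = \left\{ \begin{array}{l} \langle \text{inorder}(x_1, x_2),\ \mathit{true} \rangle, \\ \langle \text{append}(x_1, x_2, x_3),\ x_1 \vee x_3 \rangle \end{array} \right\}$$

$$D_2 = \left\{ \begin{array}{l} \langle \text{inorder}(x_1, x_2),\ x_1 \vee x_2 \rangle, \\ \langle \text{append}(x_1, x_2, x_3),\ x_1 \vee x_3 \rangle \end{array} \right\}$$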

The gfp is reached and checked in three iterations. The result asserts that a
local selection rule exists for which inorder will not suspend if either its first
or second argument is ground. Indeed, observe that if the first argument is
ground then the body atoms of the second inorder clause can be scheduled as
follows: inorder(L, L1), then inorder(R, R1), and then append(L1, A, I).
Conversely, if the second argument is ground, then the reverse ordering is suf- 
ficient for non-suspension. These call patterns are intuitive and experimental 
evaluation [26] suggests that unexpected and counter-intuitive call patterns arise 
(almost exclusively) in buggy programs. This suggests that the analysis has a 
useful role in bug detection and program development. 
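
The existence of a non-suspending order can be checked by a simple search over the body atoms. The sketch below is an illustration under stated assumptions: the head of the second inorder clause is taken to be inorder(T, I) with T = tree(L, V, R), which is not shown in this section, and the call and success patterns are those of the gfp and lfp quoted in the text. The greedy strategy and all names are hypothetical.

```python
from itertools import product

VARS = ["T", "I", "L", "V", "R", "L1", "R1", "A"]

def envs():
    """Enumerate every truth assignment over VARS."""
    for bits in product([False, True], repeat=len(VARS)):
        yield dict(zip(VARS, bits))

def entails(f, g):
    return all(g(env) for env in envs() if f(env))

def tells(e):
    # Assumed tells: T = tree(L, V, R) and A = [V|R1].
    return (e["T"] == (e["L"] and e["V"] and e["R"])
            and e["A"] == (e["V"] and e["R1"]))

# Body atom -> (call pattern from the gfp, success pattern from the lfp).
ATOMS = {
    "inorder(L, L1)": (lambda e: e["L"] or e["L1"],
                       lambda e: e["L"] == e["L1"]),
    "inorder(R, R1)": (lambda e: e["R"] or e["R1"],
                       lambda e: e["R"] == e["R1"]),
    "append(L1, A, I)": (lambda e: e["L1"] or e["I"],
                         lambda e: (e["L1"] and e["A"]) == e["I"]),
}

def schedule(initial):
    """Greedily select any atom whose call pattern the current state entails."""
    state = lambda e: initial(e) and tells(e)
    pending, order = list(ATOMS), []
    while pending:
        ready = [a for a in pending if entails(state, ATOMS[a][0])]
        if not ready:
            return None                 # every remaining atom would suspend
        atom = ready[0]
        success = ATOMS[atom][1]
        # After the atom runs, its success pattern also holds.
        state = (lambda s, f: lambda e: s(e) and f(e))(state, success)
        pending.remove(atom)
        order.append(atom)
    return order

first_ground = schedule(lambda e: e["T"])
second_ground = schedule(lambda e: e["I"])
nothing_ground = schedule(lambda e: True)
```

When the first argument is ground the search returns the left-to-right order given in the text. When the second argument is ground it starts with append; the greedy choice then returns one valid order, which need not be the exact reverse order.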



4.5 Work Related to Suspension Inference 

One of the most closely related works comes surprisingly from the compiling 
control literature and in particular the problem of generating a local selection 
rule under which a program universally terminates [34]. The technique of [34] 
builds on the termination inference method of [50] which infers initial modes for 
a query that, if satisfied, ensure that a logic program left-terminates. The chief 
advance in [34] over [50] is that it additionally infers how goals can be statically 
reordered so as to improve termination behaviour. This is performed by
augmenting each clause with body atoms $a_1, \ldots, a_n$ with $n(n-1)$ Boolean variables
$b_{i,j}$ with the interpretation that $b_{i,j} = 1$ if $a_i$ precedes $a_j$ in the reordered goal
and $b_{i,j} = 0$ otherwise. The analysis of [50] is then adapted to include
consistency constraints among the $b_{i,j}$, for instance, $b_{j,k} \wedge \neg b_{i,k} \Rightarrow \neg b_{i,j}$. In addition,
the $b_{i,j}$ are used to determine whether the post-conditions of $a_i$ contribute to
the pre-conditions of $a_j$. Although motivated differently and realised differently
(in terms of the Boolean $\mu$-calculus) this work also uses Boolean functions to
finesse the problem of enumerating the goal reorderings.
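
To see how Boolean variables can stand in for goal reorderings, the sketch below enumerates the assignments to the $b_{i,j}$ that satisfy a set of consistency constraints. The particular constraints used here (exactly one of $b_{i,j}$ and $b_{j,i}$ holds, plus the transitivity constraint in its contrapositive form) are an assumption made for illustration and are not claimed to be the exact constraints of [34].

```python
from itertools import product

def consistent(b, n):
    """Check that the Boolean variables b[i, j] encode a total order."""
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Exactly one of "a_i precedes a_j" and "a_j precedes a_i" holds.
            if b[i, j] == b[j, i]:
                return False
            for k in range(n):
                if k in (i, j):
                    continue
                # b[j,k] and not b[i,k] implies not b[i,j].
                if b[j, k] and not b[i, k] and b[i, j]:
                    return False
    return True

def orderings(n):
    """Yield every consistent assignment to the n(n-1) variables b[i, j]."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in product([False, True], repeat=len(pairs)):
        b = dict(zip(pairs, bits))
        if consistent(b, n):
            yield b
```

For three body atoms there are $2^6 = 64$ assignments to the six variables, of which six are consistent, one for each permutation of the atoms.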

A demand analysis for the ccp language Janus [61] is proposed in [16] which
determines whether or not a predicate is uni-modal. A predicate is uni-modal 
iff the argument tuple for each clause shares the same minimal pattern of in- 
stantiation necessary for reduction. The demand analysis of a predicate simply 
traverses the head and guard of each clause to determine the extent to which 
arguments have to be instantiated. Body atoms need not be considered so the 
analysis does not involve a fixpoint computation. A related paper [17] presents a 
goal-dependent (forwards) analysis that detects those ccp predicates which can 
be scheduled left-to-right without deadlock. This work is unusual in that it
attempts to detect suspension-freeness for goals under leftmost selection. Although
this approach only considers one local selection rule, it is surprisingly effective
because of the way data often flows left-to-right.

5 Backwards Type Inference 

The backwards mode inference, termination inference and suspension inference
analyses of the previous sections all apply the same operator to model reversed
information flow - the relative pseudo-complement. The key idea that these
analyses exploit is that if $d_2$ expresses a set of requirements that must hold after
a constraint is added to the store, and $d_1$ models the constraint itself, then
$d_1 \rightarrow d_2$ expresses the requirements that must hold on the store before the
constraint. Comparatively few domains possess a relative pseudo-complement and,
arguably, the most well-known type domain that comes equipped with a relative 
pseudo-complement operator is the domain of directional types [1,30]. This sec- 
tion demonstrates that backwards analysis is still applicable to problems in pro- 
gram development even when the domain is not relatively pseudo-complemented 
or when the relative pseudo-complement is not particularly tractable [40]. The 
section focuses on the problem of inferring type signatures for predicates that are 
sufficient to ensure that the execution of the program with a query satisfying the 
inferred type signatures will be free from type errors. This problem generalises 
backwards mode inference - types are richer than modes. It also generalises type 
checking in which the programmer declares type signatures for all predicates in 
the program and a type checker verifies that the program is well-typed with re- 
spect to these type signatures, that is, these type signatures are consistent with 
the operational semantics of the program. 
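
The defining property of the relative pseudo-complement can be checked exhaustively on a small Boolean lattice. The sketch below is an illustration, not part of the paper: it works over all Boolean functions on two variables (rather than over Pos or a type domain), where the relative pseudo-complement coincides with implication. The names are hypothetical.

```python
from itertools import product

VARS = ["X", "Y"]
ENVS = [dict(zip(VARS, bits)) for bits in product([False, True], repeat=len(VARS))]

def entails(f, g):
    return all(g(env) for env in ENVS if f(env))

def pseudo_complement(d1, d2):
    """On Boolean functions, d1 -> d2 is ordinary implication."""
    return lambda env: (not d1(env)) or d2(env)

def all_functions():
    """Every Boolean function over VARS, each given by its truth table."""
    keys = [tuple(env.values()) for env in ENVS]
    for outputs in product([False, True], repeat=len(ENVS)):
        tbl = dict(zip(keys, outputs))
        yield lambda env, tbl=tbl: tbl[tuple(env.values())]

d1 = lambda env: env["X"] == env["Y"]   # constraint X = Y, abstracted as X <-> Y
d2 = lambda env: env["Y"]               # requirement after the constraint: Y ground
pre = pseudo_complement(d1, d2)         # requirement before the constraint
```

Here `pre` is equivalent to $X \vee Y$, and it is the weakest pre-condition in the sense that any function $d$ satisfies $d \wedge d_1 \models d_2$ exactly when $d \models d_1 \rightarrow d_2$.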

The value of type inference is illustrated by returning to the quicksort pro- 
gram listed in Figure 1. A type checker would require the programmer to declare 
type signatures for qs and pt and then check if the program is well-typed with 
respect to these and the type signatures for builtin predicates < and > stipu- 
lated in the user manual. In contrast, type signature inference will infer that if 
qs is called with a list of numbers as the first argument then the execution of the 
program will not violate the type signatures of < and >; the programmer need 
not declare types for qs nor pt. Backwards type analysis gives the programmer 
the flexibility not to declare and maintain type signatures for predicates that are 
subject to frequent modifications during program development. In the extreme 
situation, the programmer may choose to leave the type signatures of all
user-defined predicates unspecified and let the analyser infer type signatures from builtin
and library predicates. One application of the new analysis is automatic program 
documentation. Type signatures provide valuable information for both program 
development and maintenance [62]. Another application is in bug detection. The 
inferred type signature for a predicate can be compared with that intended by 
the programmer and any discrepancy indicates the possible existence of bugs. 

5.1 Greatest Fixpoint Calculation 

The analysis is performed by computing a greatest fixpoint. It starts by assum- 
ing that no call causes a type error and then checks this assumption by rea- 
soning backwards over all clauses. If an assertion is violated, pre-conditions are 
strengthened, and the whole process is repeated. The basic datum of the
analysis is a type constraint. A type constraint is a disjunction of conjunctive type
constraints. A conjunctive type constraint, in turn, is a conjunction of atomic
type constraints of the form $x{:}\tau$ where $x$ is a variable and $\tau$ a type that denotes
a set of terms closed under instantiation. As before, tell constraints
distinguish syntactic equations from assertions which are themselves indicated by
ask constraints. The assertions specify type constraints which must be respected 
by the execution of the program. 

Each clause of the normalised program takes the form $p(x)$ :- $B_1, \ldots, B_k$
where each $B_i$ is either:

- an assertion ask($\phi$) where $\phi$ is a type constraint or

- tell($E$) where $E$ is a syntactic equation (unification) or

- a call to an atom $q(y)$ where $y$ is a vector of distinct variables.

Unlike previously, ask and tell constraints can occur several times within the same
clause. A conjunction of body atoms $B_1, \ldots, B_k$ is executed left-to-right.

As previously, backwards analysis reduces to computing a finite sequence of
iterates $D_i$. Each $D_i$ is a mapping from an atom $p(x)$ to a function that itself
maps a type constraint $\phi_R$ to another $\phi_L$ such that the execution of $p(x)$ in
a state satisfying $\phi_L$ succeeds (if it does) only in a state satisfying $\phi_R$ and
respects type constraints given by the assertions. The pair $\langle p(x), \phi_R \rangle$ is called a
demand whereas $\phi_L$ is a pre-condition for $\langle p(x), \phi_R \rangle$. $D_{i+1}$ is computed from
$D_i$ by updating the pre-condition for each demand in $D_i$ and adding new
demands to $D_{i+1}$ if necessary. For a demand $\langle p(x), \phi_R \rangle$, a type constraint $\phi_L$ is
computed from each clause $p(x)$ :- $B_1, \ldots, B_k$ by computing a series $\psi_k, \ldots, \psi_0$
of type constraints. This starts by assigning $\psi_k = \phi_R$. Then every other $\psi_{j-1}$ is
computed from $\psi_j$ as follows:

- If $B_j$ = ask($\phi$) then $\psi_{j-1}$ is calculated by $\psi_{j-1} = \psi_j \wedge \phi$.

- If $B_j$ = tell($E$) then $\psi_{j-1}$ is computed by performing backwards abstract
unification. Backwards abstract unification ensures that the result of unifying
$E$ in the context of a store satisfying $\psi_{j-1}$ is a store satisfying $\psi_j$. Since
the domain is not condensing, backwards unification cannot coincide with




178 



Jacob M. Howe, Andy King, and Lunjin Lu 



the relative pseudo-complement operator. The relative pseudo-complement
operator is unique in that it delivers the weakest abstraction which, when
combined with one given abstraction, entails another given abstraction. This
suggests there may exist a pre-condition which is strictly weaker than $\psi_{j-1}$
or strictly incomparable with $\psi_{j-1}$ which is also sufficient for ensuring that
$\psi_j$ holds after $E$. Put another way, backwards unification does not come with
the precision guarantee that characterises the relative pseudo-complement.

- If $B_j = q(y)$ is a call to a user-defined predicate then $\psi_{j-1}$ is computed as
follows:

• Let $\psi_j = \bigvee_{i=1}^{m} \mu_i$ where each $\mu_i$ is a conjunctive type constraint.

• Apply existential quantification to project $\mu_i$ onto the variables $y$ to
obtain $\nu_i$. Hence $\nu_i$ is weaker than $\mu_i$, that is, $\nu_i$ holds if $\mu_i$ holds. Moreover,
$\langle q(y), \nu_i \rangle$ is a demand that constrains only variables in $y$.

• If $\langle q(y), \nu_i \rangle$ (modulo renaming) is in $D_i$ then $\omega_i = D_i(\langle q(y), \nu_i \rangle)$;
recall that $D_i$ is interpreted as a mapping from demands to
pre-conditions.

• Otherwise, $\omega_i$ = true and $\langle q(y), \nu_i \rangle \mapsto$ true is added into $D_{i+1}$, thereby
introducing a new demand.

• Put $\psi_{j-1} = \bigvee_{i=1}^{m} (\omega_i \wedge \nu'_i)$ where each $\nu'_i$ is obtained from $\mu_i$ by applying
existential quantification to project out variables in $y$.

Then $\psi_{j-1}$ is a pre-condition for $\langle q(y), \psi_j \rangle$ provided that $\omega_i$ is a
pre-condition for $\langle q(y), \nu_i \rangle$ for each $1 \le i \le m$.

Finally $\phi_L$ is computed from $\psi_0$ via universal quantification by projecting onto
the variables within $p(x)$. As in the previous backwards analyses, this strengthens
the pre-condition such that $\psi_0$ holds if $\phi_L$ holds.
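
The ask and call cases above can be sketched with a simple representation of type constraints. This is a minimal illustration under several assumptions that are not in the paper: a type constraint is a set of conjunctions, each conjunction a set of (variable, type) pairs; projection simply keeps or drops atomic constraints; demands are matched modulo renaming by replacing variables with argument positions; and new demands are added to the same table rather than to a separate next iterate. Backwards abstract unification for tell is omitted, and all names are hypothetical.

```python
TRUE = frozenset({frozenset()})   # a single empty conjunction
FALSE = frozenset()               # an empty disjunction

def simplify(constraint):
    """Drop every conjunction that is subsumed by a weaker (smaller) one."""
    return frozenset(c for c in constraint
                     if not any(other < c for other in constraint))

def conj(p, q):
    return simplify(frozenset(a | b for a in p for b in q))

def disj(p, q):
    return simplify(frozenset(p | q))

def atom(var, typ):
    """The atomic type constraint var:typ."""
    return frozenset({frozenset({(var, typ)})})

def back_ask(psi, phi):
    """psi_{j-1} = psi_j /\\ phi for B_j = ask(phi)."""
    return conj(psi, phi)

def back_call(psi, pred, args, demands):
    """Backward step for a call pred(args); `demands` maps demands to
    pre-conditions and is extended in place with any new demand."""
    result = FALSE
    for mu in psi:
        nu = frozenset((v, t) for v, t in mu if v in args)        # onto args
        rest = frozenset((v, t) for v, t in mu if v not in args)  # args removed
        key = (pred, frozenset((args.index(v), t) for v, t in nu))
        omega = demands.setdefault(key, TRUE)
        # Rename the stored pre-condition back to the variables of this call.
        omega = frozenset(frozenset((args[i], t) for i, t in c) for c in omega)
        result = disj(result, conj(omega, frozenset({rest})))
    return result

# Data mirroring the call to append(As, Cs, Xs) in the worked example below.
xy_num = conj(atom("X", "num"), atom("Y", "num"))
psi_1 = disj(xy_num, atom("Cs", "list(num)"))
demands = {("append", frozenset()): TRUE}
psi_0 = back_call(psi_1, "append", ["As", "Cs", "Xs"], demands)
```

With this input the step yields true and records one new demand on the second argument of append, in line with the sixth step of the worked example in Section 5.2.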

5.2 Worked Example on Type Inference 

To illustrate, consider the insertionsort program listed in the first column of
Figure 9. The second column gives the program in a normalised form, decorated
with ask and tell constraints. Note how the tests X > Y and X < Y are both
replaced with the ask constraint X:num $\wedge$ Y:num where num denotes the set of
numbers. Unlike the previous backwards analyses, the analysis is driven from an
initial demand. The initial demand is the pair $\langle$sort(Xs, Ys), true$\rangle$ for which a
pre-condition is required, hence $D_0$ is:

$$D_0 = \{ \langle \text{sort(Xs, Ys)}, \text{true} \rangle \mapsto \text{true} \}$$

The iterate $D_1$ is computed by successively updating $D_0$ by considering each
clause in turn. To illustrate, consider the first clause for sort where $B_1$ =
append(As, Cs, Xs), ..., $B_6$ = sort(Zs, Ys). This clause has 6 body
atoms and analysis amounts to computing $\psi_5, \ldots, \psi_0$ where $\psi_6$ = true. The
analysis proceeds as follows:

- Firstly, $\psi_5$ = true is computed. Since $B_6$ = sort(Zs, Ys) is an atom, $\psi_6$ is
projected onto the variables {Zs, Ys}, yielding true. Then $D_1$ is checked for







sort(Xs, Ys) :-                      sort(Xs, Ys) :-
    append(As, [X, Y|Bs], Xs),           append(As, Cs, Xs),
    X > Y,                               tell(Cs = [X, Y|Bs]),
    append(As, [Y, X|Bs], Zs),           ask(X:num $\wedge$ Y:num),
    sort(Zs, Ys).                        tell(Ds = [Y, X|Bs]),
sort(Xs, Xs) :-                          append(As, Ds, Zs),
    order(Xs).                           sort(Zs, Ys).
                                     sort(Xs, Ys) :-
append([], Ys, Ys) :- true.              tell(Xs = Ys),
append([X|Xs], Ys, [X|Zs]) :-            order(Xs).
    append(Xs, Ys, Zs).              append(Xs, Ys, Zs) :-
                                         tell(Xs = [], Ys = Zs).
order([]) :- true.                   append(Xs, Ys, Zs) :-
order([_]) :- true.                      tell(Xs = [X|Xs1], Zs = [X|Zs1]),
order([X, Y|Xs]) :-                      append(Xs1, Ys, Zs1).
    X < Y,                           order(Xs) :-
    order([Y|Xs]).                       tell(Xs = []).
                                     order(Xs) :-
                                         tell(Xs = [_]).
                                     order(Xs) :-
                                         tell(Xs = [X|Xs1], Xs1 = [Y|Ys]),
                                         ask(X:num $\wedge$ Y:num),
                                         order(Xs1).

Fig. 9. insertionsort expressed in Prolog and with type assertions



the demand $\langle B_6$, true$\rangle$. Because $\langle$sort(Zs, Ys), true$\rangle$ is a variant of $\langle$sort(Xs,
Ys), true$\rangle$, no new demand is added to $D_1$ and thus $\psi_5 = D_1(\langle$sort(Zs, Ys),
true$\rangle)$ = true.

- Secondly, $\psi_4$ = true is computed. As previously, $B_5$ = append(As, Ds, Zs)
is an atom. Thus true is projected onto {As, Ds, Zs}, obtaining true. Unlike
before, $D_1$ does not contain the demand $\langle B_5$, true$\rangle$ and therefore $D_1$ is
updated to

$$D_1 = \left\{ \begin{array}{l} \langle \text{append(As, Ds, Zs)}, \text{true} \rangle \mapsto \text{true}, \\ \langle \text{sort(Xs, Ys)}, \text{true} \rangle \mapsto \text{true} \end{array} \right\}$$

and $\psi_4 = D_1(\langle$append(As, Ds, Zs), true$\rangle)$ = true.

- Thirdly, $\psi_3$ = true is computed. Because $B_4$ = tell(Ds = [Y, X|Bs]),
backwards abstract unification is applied. Since $\psi_4$ = true, this requirement is
trivially satisfied, hence $\psi_3$ = true.

- Fourthly, $\psi_2$ = X:num $\wedge$ Y:num is computed. Since $B_3$ = ask(X:num $\wedge$ Y:num),
$\psi_2 = \psi_3 \wedge \phi$ where $B_3$ = ask($\phi$).

- Fifthly, $\psi_1$ = (X:num $\wedge$ Y:num) $\vee$ Cs:list(num) is computed where list is
the standard polymorphic list constructor associated with the typing rules
list($\beta$) ::= [] and list($\beta$) ::= [$\beta$ | list($\beta$)]. Abstract backwards unification is
applied since $B_2$ = tell(Cs = [X, Y|Bs]). The disjunct (X:num $\wedge$ Y:num) derives
from the fact that a type constraint that holds before unification also holds
after unification. The disjunct Cs:list(num) derives from the fact that both
Cs and [X, Y|Bs] are of the same type after unification and Cs:list(num)
implies [X, Y|Bs]:list(num), hence (X:num $\wedge$ Y:num). More generally, backwards
abstract unification takes as inputs an equational constraint $E$ and a type
constraint $\psi$ and produces as output a type constraint $\phi$ which describes $\theta$
whenever $\psi$ describes $\mathit{mgu}(\theta(E)) \circ \theta$.

- Sixthly, $\psi_0$ = true is computed. Since $B_1$ = append(As, Cs, Xs) is an atom,
(X:num $\wedge$ Y:num) is projected onto {As, Cs, Xs} yielding true. A variant
of $\langle$append(As, Cs, Xs), true$\rangle$ is contained within $D_1$. However, projecting
{As, Cs, Xs} out of (X:num $\wedge$ Y:num) yields (X:num $\wedge$ Y:num). Thus, one
pre-condition for $\langle$append(As, Cs, Xs), (X:num $\wedge$ Y:num)$\rangle$ is (true $\wedge$ (X:num $\wedge$
Y:num)) = (X:num $\wedge$ Y:num). Another is obtained by projecting Cs:list(num)
onto {As, Cs, Xs} to obtain Cs:list(num), hence $D_1$ is updated with the new
demand:

$$D_1 = \left\{ \begin{array}{l} \langle \text{append(As, Ds, Zs)}, \text{true} \rangle \mapsto \text{true}, \\ \langle \text{append(As, Cs, Xs)}, \text{Cs:list(num)} \rangle \mapsto \text{true}, \\ \langle \text{sort(Xs, Ys)}, \text{true} \rangle \mapsto \text{true} \end{array} \right\}$$

Because $D_1(\langle$append(As, Cs, Xs), Cs:list(num)$\rangle)$ = true, the other
pre-condition for $\langle$append(As, Cs, Xs), Cs:list(num)$\rangle$ is true. Therefore $\psi_0$ = (X:num $\wedge$
Y:num) $\vee$ true = true.
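
The implication used in the fifth step, that Cs:list(num) together with Cs = [X, Y|Bs] gives X:num and Y:num, follows by unfolding the two typing rules for list. The sketch below illustrates only this unfolding and is not the backwards abstract unification of the paper. The term encoding (variables as strings, a cons cell [H|T] as a pair, the empty list as `NIL`) is an assumption.

```python
NIL = None   # encoding of the empty list []

def decompose(term, typ):
    """Atomic type constraints on variables implied by term : typ.

    Types are base type names such as "num" or pairs ("list", element_type).
    """
    if isinstance(term, str):                  # a variable: record var:typ
        return {(term, typ)}
    if not (isinstance(typ, tuple) and typ[0] == "list"):
        raise ValueError("only list-structured terms are handled in this sketch")
    if term is NIL:                            # list(beta) ::= []
        return set()
    head, tail = term                          # list(beta) ::= [beta | list(beta)]
    return decompose(head, typ[1]) | decompose(tail, typ)

LIST_NUM = ("list", "num")
constraints = decompose(("X", ("Y", "Bs")), LIST_NUM)   # the term [X, Y|Bs]
```

For the term [X, Y|Bs] and the type list(num) the result contains X:num, Y:num and Bs:list(num).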



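Processing the second clause of sort gives the same pre-condition true and
introduces one more demand $\langle$order(Xs), true$\rangle$. Therefore

$$D_1 = \left\{ \begin{array}{l} \langle \text{append(As, Ds, Zs)}, \text{true} \rangle \mapsto \text{true}, \\ \langle \text{append(As, Cs, Xs)}, \text{Cs:list(num)} \rangle \mapsto \text{true}, \\ \langle \text{order(Xs)}, \text{true} \rangle \mapsto \text{true}, \\ \langle \text{sort(Xs, Ys)}, \text{true} \rangle \mapsto \text{true} \end{array} \right\}$$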



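Omitting details of the remaining computation, the gfp is reached at $D_5$ with

$$D_5 = \left\{ \begin{array}{l} \langle \text{append(Xs, Ys, Zs)}, \text{true} \rangle \mapsto \text{true}, \\ \langle \text{append(Xs, Ys, Zs)}, \text{Zs:list(num)} \rangle \mapsto \text{Zs:list(num)}, \\ \langle \text{append(Xs, Ys, Zs)}, \text{Ys:list(num)} \rangle \mapsto \text{Ys:list(num)} \vee \text{Zs:list(num)}, \\ \langle \text{order(Xs)}, \text{true} \rangle \mapsto \text{Xs:list(num)}, \\ \langle \text{sort(Xs, Ys)}, \text{true} \rangle \mapsto \text{Xs:list(num)} \end{array} \right\}$$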



The gfp asserts that sort cannot generate a type error (nor can any predicate it
subsequently calls) if it is called with a list of numbers as its first argument. It
also states that order will not generate a type error if it is called with a list of
numbers. Interestingly, it also asserts that calling append with its third argument
instantiated to a list of numbers ensures that its second argument is instantiated
to a list of numbers.



5.3 Work Related to Type Inference 

Type analysis can be performed either with or without type definitions provided 
by the programmer. The former are easy for the programmer to understand 
whereas the latter are useful in compiler optimisation but can be more diffi- 
cult for the programmer to interpret. If type definitions are not given by the 
programmer, then the analysis has to infer both the type definitions and the 
type descriptions for the program components. Traditionally unary regular logic 
programs [66] and type graphs [13] have been applied to this class of problem, 
though modern set-based techniques founded on non-deterministic finite tree 
automata offer a number of advantages [23]. 

Alternatively, if type definitions are supplied by the programmer, then the 
analysis need only infer type descriptions from the type constructors for the 
program components. In this class of problem of particular note is the work on 
formulating type dependency domains with ACI-unification [9] since the result- 
ing domains condense. Directional type analysis [1,59] is likewise performed with 
type definitions provided by the programmer. A directional type $p(x) : \sigma \rightarrow \tau$
indicates that if $p(x)$ is called with $x$ being of type $\sigma$ then $x$ is of type $\tau$ upon
the success of $p(x)$. Aiken and Lakshman [1] provide a procedure for checking
if a program is well-typed with respect to a given set of monomorphic direc- 
tional types, whereas Rychlikowski and Truderung [59] provide type checking 
and inference algorithms for polymorphic types. 

All the above type analyses propagate type information in the direction of 
program execution and compute upper approximations to the set of reachable 
program stores. In contrast, the backwards type analysis reviewed in this section 
propagates type information in the reverse direction of program execution and 
computes lower approximations to the set of program stores from which the 
execution will not violate any type assertions. 



6 Directions for Research on Backwards Analysis 

6.1 Backwards Analysis and Module Interaction 

When reasoning about module interaction it can be advantageous to reverse the 
traditional deductive approach to abstract interpretation that is based on the 
abstract unfolding of abstract goals. In particular, [27] shows how abduction 
and abstraction can be combined to compute those properties that one module 
must satisfy to ensure that its composition with another fulfils certain require- 
ments. Abductive analysis can, for example, determine how an optimisation in 
one module depends on a predicate defined in another module. Abductive anal- 
ysis is related to backwards analysis since abduction is the inverse of deduction 
in much the same way that the relative pseudo-complement is the reverse of
conjunction. This suggests that the relationship between backwards analysis and
abductive analysis warrants further investigation. 

6.2 Backwards Analysis and Unfolding 

Automatic program specialisation is a recurring theme in logic program
development and one important aspect of this is the control of polyvariance [58].
Too much polyvariance (too many versions of a predicate) can lead to code bloat 
whereas too little polyvariance (too few versions of a predicate) can impede pro- 
gram specialisation and thereby efficiency. Surprisingly few works have addressed 
the problem of relating polyvariance to the ensuing optimisations [58], but recent
work [49] has suggested that backwards analysis can be applied to control poly- 
variance by inferring specialisation conditions. Backwards analysis then becomes 
a pre-processing step that precedes the goal-dependent analysis and determines 
the degree of unfolding. Specifically, if the specialisation conditions are satisfied 
by an (abstract) call in a goal-dependent analysis then the call will possibly lead 
to valuable optimisations, and therefore it should not be merged with calls that 
lead to a lower level of optimisation. The backwards analysis in effect provides a 
convenient separation of concerns in that it enables version generation decisions 
to be made prior to applying top-down analysis. This work generalises and re- 
fines earlier work on compile-time garbage collection [48] that presents a kind of 
ad hoc backwards analysis for deriving reuse conditions for Mercury [62]. These 
works, and in particular [49], show how backwards analysis can provide a useful 
separation of concerns: the backwards analysis infers specialisation conditions 
which are later used in version control. This is reminiscent of the separation of 
control from unfolding that arises in off-line binding-time analysis [6]. In fact 
one promising direction for research would be to investigate how termination 
inference can be adapted to infer conditions under which loops can be partially 
unfolded. 



6.3 Backwards Analysis and Hoare Logic 

Pedreschi and Ruggieri [55] develop a calculus of weakest pre-conditions and 
weakest liberal pre-conditions, the latter of which is essentially a reformulation 
of Hoare’s logic. Weakest liberal pre-conditions are characterised as the greatest 
fixpoint of a co-continuous operator on the space of interpretations. The work 
is motivated by, among other things, the desire to infer the absence of ill-typed 
arithmetic. Interestingly, it has recently been shown [40] that backwards analysis
infers not only sufficient pre-conditions but the weakest pre-conditions. On the
practical side, it means that backwards analysis need not be applied if forwards 
analysis cannot verify that a given query satisfies the assertions. Conversely, if an 
initial query is not inferred by backwards analysis, then it follows that forwards 
analysis cannot infer that the query satisfies the assertions. More generally, the 
expressive power of any backwards analysis needs to be compared against that 
of the forwards analysis that it attempts to reverse. 

6.4 Backwards Analysis and Domain Refinement

Recent work in domain refinement [29] has shown that the problem of minimally 
enriching an abstract domain to make it condense reduces to the problem of 
making the domain complete with respect to unification. Specifically, the work 
shows that unification coincides with multiplicative conjunction in a quantale of 
(idempotent) substitutions and that elements in a complete refined (condensing) 
abstract domain can be expressed in terms of linear logic. The significance of
this work for backwards analysis is that it provides a pathway for synthesising
condensing domains that are not necessarily downward-closed. This suggests 
that the framework of [39] needs to be revised to accommodate these domains. 

6.5 Backwards Analysis and Transformation 

Very recently Gallagher [22] has proposed program transformation as a tactic 
for realising backwards analysis in terms of forwards analysis. Assertions are 
realised with a meta-predicate d(G, P) which expresses the relationship between 
an initial goal G and a property P to be checked at some program point. The 
meta-predicate d(G, P) holds if there is a derivation starting from G leading to
the program point. The transformed program defining the predicate d can be seen 
as a realisation of the resultants semantics [21]. Backwards analysis is performed 
by examining the meaning of d, which can be approximated using a standard 
forwards analysis, to deduce goals G that imply that the property P holds. 
This work is both promising and intriguing because it finesses the requirement 
of calculating a greatest fixpoint. One interesting line of enquiry would be to 
compare the expressive power of transformation - the pre-conditions it infers -
against those deduced via a bespoke backwards analysis framework [39,44]. 

7 Concluding Discussion 

This paper has shown how four classic program analysis and program develop- 
ment problems can be reversed. Reversal is a laudable goal in program analysis 
because it transforms a goal-dependent, checking problem into a goal-independ- 
ent, inference problem; the latter being more general than the former. Arguably 
the greatest strength of backwards analysis is its ease of automation: backwards 
analyses can be surprisingly simple to implement and efficient to apply, and goal- 
independence means that it can be applied without any programmer interaction. 
Programmers merely have to interpret the inferred results and inspect the pro- 
gram if the results do not match their expectations. Thus, although backwards 
analysis is not yet a mainstream technology in the analysis of logic programs, its 
benefits need to be carefully weighed when a particular program development 
problem is being considered. 

Backwards analysis is a modern approach to the analysis of logic programs in 
the sense that it relies on ideas that have been developed comparatively recently 
within the context of domain refinement. Backwards analysis thus illustrates the 
value of foundational work in logic program development. It also demonstrates
the benefits of developing programs within the context of logic programming: 
the elegance of the underlying semantics manifests itself in the simplicity of the 
analyses. In fact it is fair to say that if we have seen slightly further in program 
development, it is only because we stand on the shoulders of those who have 
developed the underpinning semantics and abstract interpretation techniques. 


Abstract. In this work, we develop a binding-time analysis for the
logic programming language Mercury. We introduce a precise domain 
of binding-times, based on the type information available in Mercury 
programs, that allows the analyser to reason with partially static data 
structures. The analysis is polyvariant, and deals with the module struc- 
ture and higher-order capabilities of Mercury programs. 



1 Introduction 

Program specialisation is a technique that transforms a program into another 
program, by precomputing some of its operations. Assume we have a program
P whose input can be divided into two parts, say s and d. If one of the
input parts, say s, is known at some point in the computation, we can specialise 
P with respect to the available input s. This specialisation process comprises 
performing those computations of P that depend only on s, and recording their 
results in a new program, together with the code for those computations that 
could not be performed (because they rely on the input part d - unknown at this 
point in the computation). The result of the specialisation is a new program, $P_s$,
that computes, when provided with the remaining input part d, the same result 
as P does when provided with the complete input s + d. Comprising a mixture 
of program evaluation and code generation, the program specialisation process 
is also often referred to by the names partial evaluation, mixed computation or 
staged computation. 
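
A classic small instance of this idea is the specialisation of an exponentiation routine with respect to a known exponent. The sketch below is purely illustrative and is written in Python rather than Mercury; the function names `power` and `specialise_power` are hypothetical, and the specialiser is hand-written for this one program.

```python
def power(x, n):
    """The general program P: computes x ** n for a non-negative integer n."""
    result = 1
    for _ in range(n):
        result *= x
    return result

def specialise_power(n):
    """Specialise `power` with respect to the static input n.

    The loop depends only on n, so it is unrolled at specialisation time.
    The multiplications depend on the dynamic input x, so code is generated
    for them instead of executing them.
    """
    body = " * ".join(["x"] * n) if n > 0 else "1"
    source = f"def power_{n}(x):\n    return {body}\n"
    namespace = {}
    exec(source, namespace)          # build the residual program P_s
    return namespace[f"power_{n}"], source

power_3, residual_source = specialise_power(3)
```

The residual program for n = 3 contains no loop, only the expression `x * x * x`, and it agrees with `power(d, 3)` on every dynamic input d.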

Staging the computations of a program can be useful (usually in terms of 
efficiency) when different parts of a program’s input become known at different 
times during the computation. The best benefit can be obtained when a single 
program must be run a number of times while a part of its input remains constant 
over the different runs. In this case, the program can first be specialised with 
respect to the constant part of the input, while afterwards the resulting program 
can be run a number of times, once for each of the remaining (different) input 
parts. In such a staged approach, the computations that depend only on the
constant input part are performed only once - during specialisation. In the non- 
staged approach, all computations - including those depending on the constant 
part - are performed in every run of the program. 

When using program specialisation to stage the computations of a program, 
the basic problem is deciding what computations can be safely performed during 
the specialisation process. The driving force behind this decision is twofold. 
Firstly, the specialisation process itself must terminate; that is, the specialiser
must not get into a loop when evaluating a sequence of computations from the
program that is to be specialised. Secondly, the obtained degree of specialisation
should be “as good as possible”, meaning that a fair number of the computations
that can be performed during specialisation are actually performed during
specialisation.

The key factor determining whether a computation can be performed during 
specialisation is the fact whether enough input values are available to compute a 
result. If that is the case, the specialiser can perform the computation; if not, it 
should generate code to perform this computation at a later stage. Binding-time 
analysis is a static analysis that, given the program and a description about the 
available partial input with respect to which the program will be specialised, 
computes for every statement in the program what input values will be known 
when that statement is reached during specialisation. In addition, the analysis 
computes — according to some control strategy — whether or not the statement 
should be evaluated during specialisation. 
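
The idea can be sketched for a toy language of straight-line assignments. The sketch below is only an illustration under simplifying assumptions: the statement encoding, the function name `binding_time_analysis` and the annotation labels are hypothetical, and the control strategy is the simplest possible one (evaluate a statement whenever all of its operands are known).

```python
def binding_time_analysis(statements, static_inputs):
    """Annotate straight-line statements for an offline specialiser.

    statements: list of (target, operands) pairs in execution order.
    static_inputs: names of the inputs known at specialisation time.
    """
    known = set(static_inputs)
    annotations = []
    for target, operands in statements:
        if all(operand in known for operand in operands):
            # Every operand is available early: evaluate during specialisation.
            known.add(target)
            annotations.append((target, "evaluate"))
        else:
            # Some operand is only known later: emit residual code instead.
            known.discard(target)
            annotations.append((target, "residualise"))
    return annotations


# Toy program: n2 = n * n; y = x + n2; z = n2 + 1, with only n known early.
program = [("n2", ["n"]), ("y", ["x", "n2"]), ("z", ["n2"])]
annotated = binding_time_analysis(program, static_inputs={"n"})
```

An offline specialiser would then simply follow these annotations, without taking further control decisions of its own.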

Once the program P and its available partial input s have been analysed 
by binding-time analysis, specialisation of P with respect to s boils down to 
evaluating those statements in P that are annotated as such by the binding-time 
analysis. This specialisation technique is called offline, the reason being that 
most of the control decisions about what statements should be evaluated have 
been taken by the binding-time analysis. This contrasts with the so-called online 
specialisation technique in which the program to be specialised is not analysed 
by any binding-time analysis, but is directly evaluated with respect to its partial 
input under the supervision of a control system that decides on the fly, for 
every statement under consideration, whether or not it can safely be evaluated. Both 
approaches towards specialisation have their advantages and disadvantages. In 
this work, we concentrate on offline specialisation and construct a binding-time 
analysis for the logic programming language Mercury. 



1.1 Binding-Time Analysis and Logic Programming 

Using binding-time analysis to control the behaviour of the specialisation has 
been thoroughly investigated in a number of programming paradigms. Groundbreaking 
work on offline program specialisation of imperative languages includes C-mix by 
Andersen [1] and more recently Tempo [10,20] by Consel and his group. In the 
context of functional language specialisation, most work focusing on binding-time 
analysis and offline specialisation was originally motivated by the desire to 
achieve better self-application [13,24]. Whereas initial analyses dealt with 
first-order languages [24], more recently developed analyses deal with higher-order 
aspects [15,4], polymorphism [32,19] and partially static data structures [28]. 

In the field of logic programming, however, little attention has been 
paid to offline program specialisation. Known exceptions are logimix [33] and 
LOGEN [25] that develop different approaches to offline program specialisation 
for Prolog. Both cited works, however, lack an automatic binding-time analysis 
and rely on the user to provide the specialiser with suitable annotations of the 
program. To the best of our knowledge, the only attempt to construct an auto- 
matic binding-time analysis for logic programming is [6] and our own work about 
which we report in [30]. The approach of [6] is particular, in the sense that it 
obtains the required annotations not by analysing the subject program directly 
but rather by analysing the behaviour of an online program specialiser on the 
subject program. Although conceptually interesting, the latter approach is overly 
conservative and restricts the number of computations that can be performed 
during specialisation. Indeed, [6] decides whether to unfold a call or not based on 
the original program, not taking current annotations into account. This means 
that a call can either be completely unfolded or not at all. The binding-time 
analysis first described in [50] and employed in [30] is also particular in the sense 
that it obtains its annotations by repeatedly applying an automatic termination 
analysis. If the termination analysis identifies a call as possibly non-terminating, 
that call is marked such that it will not be reduced by the specialiser. Then 
the termination analysis is rerun to prove termination of the program under the 
assumption that each call that is marked as non-reducible is not evaluated. The 
process is repeated until termination of the (annotated) program can be proven. 

The approaches of both [6] and [30] were designed to deal with untyped and 
unmoded logic programming languages. The fact that most logic programming 
languages are untyped makes it harder to represent the availability of 
partial input in a sufficiently precise way during the analysis. More importantly, 
the lack of control flow information in the program makes it very difficult to 
approximate the data flow in a sufficiently precise way and renders the derivation 
of a binding-time analysis by “classic” abstract interpretation techniques not 
straightforward, hence the approaches of [6] and [30]. In this work, we construct 
a completely automatic binding-time analysis for the recently introduced logic 
programming language Mercury. Being a strongly typed and moded language, 
Mercury lifts the obstacles encountered in more traditional logic programming 
languages and allows the construction of a “traditional” binding-time analysis along 
the lines of [15,23] based on data flow analysis. However, the more involved 
data and control flow features inherent to a logic programming language 
still render the derivation of an automatic binding-time analysis a daunting and 
not straightforward task. 

1.2 Mercury 

The design of Mercury was started in October 1993 by researchers at the 
University of Melbourne. While logic programming languages had been around for 
quite some time, none seemed to fully realise the theoretical advantages such 
a language would have over more traditional, imperative languages. These ad- 
vantages are widely known, and are summarised for example in [42]: a higher 
level of expressivity (enabling the programmer to concentrate on what has to be 
done rather than on how to do it), the availability of a useful formal semantics 
(required for the - relatively - straightforward design of analysis and transfor- 
mation tools), a semantics that is independent of any order of evaluation (useful 
for parallelising the code), and a potential for declarative debugging [31]. While 
a language like Prolog does offer some of these advantages, others are destroyed 
by the impure features of the language. 

The main objective of the Mercury designers was to create a logic programming 
language that would be pure and useful for the implementation of a large 
number of real-world applications. To achieve this goal, the main design 
objectives of Mercury can be summarised as follows [42]: 

- Support for the creation of reliable programs. This involves a language that 
  allows the compiler to detect some classes of bugs. 

- Support for programming in teams. Large software systems are usually built 
  by a number of programmers. The language must provide good support for 
  creating a single application from multiple parts that are built (sometimes 
  in isolation) by different programmers. 

These two objectives form a major departure from Prolog which, at the time, 
had basically no support for programming in the large, and which does not 
allow a lot of type-, mode- and determinism errors to be caught at compile-time. 
Another important objective was support for the creation of efficient programs. 
The compiler had to produce code whose performance is competitive with that 
produced by compilers of other languages. To meet these design objectives, 
Mercury was fitted with a strong system of type-, mode- and determinism 
declarations. Besides providing the programmer with some valuable documentation, 
these declarations enable the compiler to check the internal consistency of the 
program and to spot a substantial number of bugs that would go unnoticed in 
declaration-free code submitted to a Prolog compiler. Also, the availability of 
declarations allows the compiler to adapt the evaluation order of the body atoms 
in a predicate and as such provides the basis for an efficient execution 
mechanism of the language [11,41,43]. Mercury is equipped with a modern module 
system that makes it possible to hide some data definitions and to encapsulate 
both data and code, and as such provides support for programming-in-the-large 
activities. 

1.3 Structure of the Paper 

The remainder of this paper is organised as follows. In the following section, we 
introduce a domain of binding-times that is based on the type information 
available in Mercury programs. Next, in Section 3, we introduce a 2-phase 
binding-time analysis for a first-order subset of Mercury. The first phase of the 
analysis performs a symbolic data flow analysis that, being call-independent, can 
be performed for each module in isolation, bottom-up over the module hierarchy. 
The second phase of the analysis, which computes the actual annotations, is 
call-dependent by nature and relies on the result of the symbolic analysis for all 
modules involved. In Section 4, we lift the first-order restriction and enhance the 
analysis such that it computes and propagates closure information throughout 
the program that is being analysed. In Section 5, we work through an example 
and discuss to what extent our method is also applicable to typed Prolog 
programs. We conclude this paper in Section 6 with a discussion of our binding-time 
analysis and its relation with existing work in the literature. 

2 A Domain of Binding-Times 

Binding-time analysis can be seen as an application of abstract interpretation 
over a domain of binding-times. A binding-time abstracts a value by specifying 
at what time during a 2-stage computation^ the value becomes known. In its 
most basic form, the binding-time of a value is either static or dynamic, denoting 
a value that is known early, during specialisation, or late, during evaluation of 
the residual program, respectively. 

It is recognised [23] that for a logic programming language, approximating 
values by either static or dynamic is too coarse grained in general. Indeed, most 
logic programs use a lot of structured data, where data values are represented 
by structured terms. Consequently, the input to the specialiser usually consists 
of a partially instantiated term: a term that is less instantiated than it would 
be at run-time. Approximating a partially instantiated term by dynamic usually 
results in too much information loss, possibly resulting in missed specialisation 
opportunities. Therefore, we use the structural information from the type system 
of Mercury to represent more detailed binding-times, capable of distinguishing 
between the computation stages in which parts of a value (according to that 
value’s type) become known. 

In what follows, we formally define the notions of type, type definition, type 
trees and type graphs, which we will use later on as the basis of our abstract 
domain. Our formalisation is mainly based on [47,48] and [46], but similar notions 
and definitions can be found in related work on program analysis involving types, 
e.g. [39,22,45,38,37]. Mercury’s type system is based on a polymorphic 
many-sorted logic, and corresponds to the Mycroft-O’Keefe type system [34]. Basically, 
the types are discriminated union types and support parametric polymorphism: 
a type definition can be parametrised with some type variables, as the following 
example in Mercury syntax shows. 

Example 1. :- type list(T) ---> [] ; [T | list(T)]. 

The above defines a polymorphic type list(T): it defines values of this type to 
be terms that are either [] (the empty list) or of the form [A | B] where A is a 
value of type T and B is a value of type list(T). 

Formally, if we denote with $\Sigma_{\mathcal{T}}$ the set of type constructors and with $V_{\mathcal{T}}$ 
the set of type variables of a language $\mathcal{L}$, the set of types associated to $\mathcal{L}$ is 
represented by $\mathcal{T}(\Sigma_{\mathcal{T}}, V_{\mathcal{T}})$; that is, the set of terms that can be constructed from 
$\Sigma_{\mathcal{T}}$ and $V_{\mathcal{T}}$. A type containing variables is said to be polymorphic, otherwise it 
is a monomorphic type. A type substitution is a substitution from type variables 
to types. The application of a type substitution to a polymorphic type results 
in a new type, which is an instance of the original type. 

^ Generalisations exist in which computations are staged over more than 2 stages 
(see e.g. [14]). In this work, we focus on a traditional 2-stage process, dividing the 
computations in a program over specialisation-time versus run-time. 

As usual, the set of program values is denoted by $\mathcal{T}(\Sigma, V)$; that is, the set of 
terms that can be constructed from a set $\Sigma$ of function symbols and a set $V$ of 
program variables. 

The relation between a type and the values (terms) that constitute the type 
is made explicit by a type definition that consists of a number of type rules, 
one for every type constructor. Example 1 shows the type rule associated to the 
list/1 type constructor. Formally, a type rule is defined as follows: 

Definition 1 (type rule). The type rule associated to a type constructor $h/n \in \Sigma_{\mathcal{T}}$ 
is a definition of the form 

$$h(\overline{T}) \longrightarrow f_1(\overline{\tau}_1) \; ; \; \ldots \; ; \; f_k(\overline{\tau}_k).$$

where $\overline{T}$ is a sequence of $n$ type variables from $V_{\mathcal{T}}$ and for $1 \le i \le k$, $f_i/m \in \Sigma$ 
with $\overline{\tau}_i$ a sequence of $m$ types from $\mathcal{T}(\Sigma_{\mathcal{T}}, V_{\mathcal{T}})$ and all of the type variables 
occurring in the right hand side occur in the left hand side as well. The function 
symbols $\{f_1, \ldots, f_k\}$ are said to be associated with the type constructor $h$. A 
finite set of type rules is called a type definition. 

Given a type substitution, we define the notion of an instance of a type rule 
in a straightforward way. In theory, every type (constructor) can be defined by a 
type rule as above. In practice, however, it is useful to have some types built into 
the system. For Mercury, the types int, float, char, string are builtin types 
whose denotation is predefined and is the set of integers, floating point numbers, 
characters and strings respectively. A type is called atomic if it is defined by a 
set of zero-arity function symbols $\{f_1, \ldots, f_k\}$. 
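
To make these notions concrete, type rules can be encoded as plain data. The Python sketch below is only one possible encoding and is not part of the formal development: the dictionary layout, the helper names and the `colour` and `bad` types are hypothetical. A type is either a string (a builtin type, or otherwise a type variable) or a tuple consisting of a type constructor followed by its argument types.

```python
BUILTINS = {"int", "float", "char", "string"}

# constructor -> (type parameters, alternatives as (functor, argument types))
TYPE_RULES = {
    "list": (("T",), [("[]", []), ("[|]", ["T", ("list", "T")])]),
    "colour": ((), [("red", []), ("green", []), ("blue", [])]),
    "bad": ((), [("wrap", ["T"])]),  # deliberately ill-formed: T is unbound
}


def type_variables(ty):
    """Collect the type variables occurring in a type."""
    if isinstance(ty, str):
        return set() if ty in BUILTINS else {ty}
    found = set()
    for arg in ty[1:]:
        found |= type_variables(arg)
    return found


def is_well_formed(constructor):
    """Check that every right-hand side variable occurs on the left-hand side."""
    params, alternatives = TYPE_RULES[constructor]
    used = set()
    for _functor, arg_types in alternatives:
        for ty in arg_types:
            used |= type_variables(ty)
    return used <= set(params)


def is_atomic(constructor):
    """A type is atomic if all of its function symbols have arity zero."""
    _params, alternatives = TYPE_RULES[constructor]
    return all(not arg_types for _functor, arg_types in alternatives)
```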

Mercury is a statically typed language, in which the (possibly polymorphic) 
type of every term occurring in the program text is known at compile-time. In 
what follows, we use the type definition to construct, for every type occurring 
in the program, a finite description of the structure that values belonging to the 
denotation of a particular type can take. The relevance of such a description is 
in the fact that it can be used to abstract the values belonging to the denotation 
of the type according to their structure. This allows the construction of a precise 
abstract domain for program analysis, in particular binding-time analysis. 

To extract a structural description of a type from a type definition, we 
introduce the notion of a type-path, being a sequence of functor/argument position 
pairs that is meant to denote a path through the type definition from a type 
to an occurrence of one of its subtypes. In fact, a type itself can be represented 
as a (possibly infinite) set of such paths, one for every path from the type that 
is being defined to some subtype occurring at a particular position within some 
term belonging to the denotation of that type. More formally, we denote the 
set of all such sequences over $\Sigma \times \mathbb{N}$ by $TPath$. The empty sequence is denoted 
by $\langle\rangle$, and given $\delta, \gamma \in TPath$, we denote with $\delta \cdot \gamma$ the sequence obtained by 
concatenating $\gamma$ to $\delta$. A type tree for a particular type can then be defined as 
follows: 



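Definition 2 (type tree). Given a type $\tau \in \mathcal{T}(\Sigma_{\mathcal{T}}, V_{\mathcal{T}})$, the type tree of $\tau$, 
denoted by $\mathcal{L}_\tau$, is a set of sequences from $TPath$ and is recursively defined as: 

- $\langle\rangle \in \mathcal{L}_\tau$ 

- if $\tau = h(\overline{T})\theta$ and $h(\overline{T}) \longrightarrow f_1(\overline{\tau}_1) ; \ldots ; f_k(\overline{\tau}_k)$ is a type rule then 
  $\langle (f_i, j) \rangle \cdot \delta \in \mathcal{L}_\tau$ where (i) $i \in \{1 \ldots k\}$, (ii) $f_i$ has arity $m$ in $\Sigma$, 
  (iii) $j \in \{1 \ldots m\}$, (iv) $\tau_{ij}$ denotes the $j$-th type in $\overline{\tau}_i$, and (v) $\delta \in \mathcal{L}_{(\tau_{ij})\theta}$. 

Note that the type tree of an atomic type is $\{\langle\rangle\}$ as a term belonging to an 
atomic type does not have any subterms. Likewise, the type tree of a type 
variable $T$ is defined as $\mathcal{L}_T = \{\langle\rangle\}$. 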



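Example 2. Reconsider the type list(T) from Example 1. As $\mathcal{L}_T = \{\langle\rangle\}$, the type 
tree of list(T) is the infinite set of type paths 

$$\mathcal{L}_{list(T)} = \left\{ \begin{array}{l}
\langle\rangle \\
\langle([|],1)\rangle \\
\langle([|],2)\rangle \\
\langle([|],2),([|],1)\rangle \\
\langle([|],2),([|],2)\rangle \\
\langle([|],2),([|],2),([|],1)\rangle \\
\langle([|],2),([|],2),([|],2)\rangle \\
\langle([|],2),([|],2),([|],2),([|],1)\rangle \\
\ldots
\end{array} \right\}$$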
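
Since the type tree of list(T) is infinite, it can only be enumerated up to a bound. The following Python sketch lists all type paths up to a given length; the encoding of types as strings and tuples, and the names `TYPE_DEFS`, `children` and `type_tree`, are assumptions made for this illustration only.

```python
# constructor -> (type parameters, alternatives as (functor, argument types))
TYPE_DEFS = {"list": (("T",), [("[]", []), ("[|]", ["T", ("list", "T")])])}


def substitute(ty, theta):
    """Apply a type substitution to a type."""
    if isinstance(ty, str):
        return theta.get(ty, ty)
    return (ty[0],) + tuple(substitute(arg, theta) for arg in ty[1:])


def children(ty):
    """Yield ((functor, position), subtype) for every one-step extension."""
    if isinstance(ty, str):  # type variables and builtins have no subterms
        return
    params, alternatives = TYPE_DEFS[ty[0]]
    theta = dict(zip(params, ty[1:]))
    for functor, arg_types in alternatives:
        for position, arg in enumerate(arg_types, start=1):
            yield (functor, position), substitute(arg, theta)


def type_tree(ty, max_length):
    """List the type paths of `ty` of length at most `max_length`."""
    paths = [()]
    frontier = [((), ty)]
    for _ in range(max_length):
        next_frontier = []
        for path, subtype in frontier:
            for step, child in children(subtype):
                next_frontier.append((path + (step,), child))
        paths.extend(path for path, _ in next_frontier)
        frontier = next_frontier
    return paths


paths_up_to_two = type_tree(("list", "T"), 2)
```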



The general idea now is to define, for any type $\tau$, a finite approximation of $\mathcal{L}_\tau$ 
that provides a good characterisation of the structure of terms of type $\tau$. First 
we introduce the following notation that formally defines the (sub)type that is 
identified by a type-path within another type. 

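Definition 3 (type selected by type-path). Let $\tau = h(\overline{T})\theta$ be a type and $\delta$ 
a path in $\mathcal{L}_\tau$. If $\delta = \langle\rangle$ then $\tau^\delta = \tau$. Otherwise, $\delta$ has the form $\langle (f, i) \rangle \cdot \gamma$, the type 
rule for $h(\overline{T})$ has in the right-hand side an alternative of the form $f(\tau_1, \ldots, \tau_n)$ 
and $\tau^\delta = (\tau_i\theta)^\gamma$. 

Note that a type path $\delta \in \mathcal{L}_\tau$ can also be used to identify a particular subterm 
in a term $t : \tau$, if it exists. Indeed, if $\delta \in TPath$ is of the form $\delta = \langle (f, i) \rangle \cdot \gamma$ and 
$t = f(t_1, \ldots, t_n)$ we define $t^\delta = t_i^\gamma$. 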
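
Both selections can be sketched in Python as follows. The encoding is an assumption made for illustration: types are strings or tuples as in a type rule, terms are tuples consisting of a functor followed by its arguments (with Python integers for integer constants), and the function names are hypothetical.

```python
# constructor -> (type parameters, alternatives as (functor, argument types))
TYPE_DEFS = {"list": (("T",), [("[]", []), ("[|]", ["T", ("list", "T")])])}


def substitute(ty, theta):
    """Apply a type substitution to a type."""
    if isinstance(ty, str):
        return theta.get(ty, ty)
    return (ty[0],) + tuple(substitute(arg, theta) for arg in ty[1:])


def select_type(ty, path):
    """Return the type selected by `path` in `ty`; the path must be in its type tree."""
    for functor, position in path:
        if isinstance(ty, str):
            raise ValueError("path leaves the type tree")
        params, alternatives = TYPE_DEFS[ty[0]]
        theta = dict(zip(params, ty[1:]))
        arg_types = dict(alternatives)[functor]
        ty = substitute(arg_types[position - 1], theta)
    return ty


def select_subterm(term, path):
    """Return the subterm selected by `path`, or None if it does not exist."""
    for functor, position in path:
        if not isinstance(term, tuple) or term[0] != functor:
            return None
        term = term[position]
    return term


LIST_T = ("list", "T")
one_two = ("[|]", 1, ("[|]", 2, ("[]",)))  # the list [1, 2]
```

Selecting with the path of length two that follows the second and then the first argument of `[|]` yields the second element of the list.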

Example 3. If $\tau = list(T)$ we have for example that 

$\tau^{\langle\rangle} = list(T)$, $\tau^{\langle([|],1)\rangle} = T$ and $\tau^{\langle([|],2)\rangle} = \tau^{\langle([|],2),([|],2)\rangle} = list(T)$. 

Similarly for a term $t = [1,2]$ we have for example that 

$t^{\langle\rangle} = [1,2]$, $t^{\langle([|],1)\rangle} = 1$ and $t^{\langle([|],2),([|],1)\rangle} = 2$. 

In what follows, we will use the notion of a type-graph as a finite 
approximation of a possibly infinite type-tree. Therefore, we introduce the following 
equivalence relation on the paths in a type tree $\mathcal{L}_\tau$. We define $\equiv$ (in $\mathcal{L}_\tau$) as the 
least transitive relation such that for any $\delta, \alpha \in \mathcal{L}_\tau$: if $\delta = \alpha \cdot \gamma$ and $\tau^\delta = \tau^\alpha$ 
then $\alpha \equiv \delta$. Informally, two type paths in a type tree are equivalent if either 
one of the paths is an extension of the other while both identify the same type, 
or the paths share a common initial subpath of the same type as both paths in 
$\mathcal{L}_\tau$. In what follows, we restrict our attention to (possibly polymorphic) types 
that are not defined in terms of a strict instance of themselves. That is, we assume for 
any type $\tau$ and $\delta \in \mathcal{L}_\tau$ that there doesn’t exist a type substitution $\theta$ such that 
$\tau^\delta = \tau\theta$. This is a natural condition and is related to the polymorphism 
discipline of definitional genericity [27]. For any such type $\tau$, the equivalence relation 
$\equiv$ partitions the (possibly infinite) set $\mathcal{L}_\tau$ into a finite number of equivalence 
classes. For any $\delta \in \mathcal{L}_\tau$, the equivalence class of $\delta$ is defined as 

$[\delta] = \{\gamma \in \mathcal{L}_\tau \mid \delta \equiv \gamma\}$. 

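The least element of an equivalence class $[\delta]$ exists and is defined as follows. 

$\overline{[\delta]} = \alpha \in [\delta]$ such that $\forall \beta \in [\delta] : \beta = \alpha \cdot \gamma$ for some $\gamma \in TPath$ 

Next, we define, for a type $\tau$, its type graph as the finite set of minimal elements 
of the equivalence classes of $\mathcal{L}_\tau$. 

Definition 4 (type-graph). For a type $\tau \in \mathcal{T}(\Sigma_{\mathcal{T}}, V_{\mathcal{T}})$, we denote $\tau$’s type 
graph by $\mathcal{L}_\tau^{\equiv}$ which is defined as 

$\mathcal{L}_\tau^{\equiv} = \{\overline{[\delta]} \mid \delta \in \mathcal{L}_\tau\}$. 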

A type graph $\mathcal{L}_\tau^{\equiv}$ provides a finite approximation of the structure of terms of 
type $\tau$: every path in $\mathcal{L}_\tau^{\equiv}$ abstracts a number of subterms of the term according 
to their type and position in the term. For the list(T) type from above, 
$\mathcal{L}_{list(T)}^{\equiv} = \{\langle\rangle, \langle([|],1)\rangle\}$. The path $\langle\rangle$ represents all subterms of type list(T) in a 
term of type list(T), whereas $\langle([|],1)\rangle$ represents all subterms of type T occurring in 
the first argument position of a functor [|]. In other words, $\langle\rangle$ can be seen as 
identifying the skeleton of the list, whereas $\langle([|],1)\rangle$ identifies the elements 
of the list. Note that as our notions of type-tree and type-graph describe the 
possible positions of subterms in terms of a particular type, they do not contain 
the zero-arity functors that possibly belong to the definition of the type. As such, 
our notions differ from more classic definitions of type-trees and type-graphs such as 
[22] or [39]. Also note that due to the particular definition of $\equiv$, two subterms 
of the same type are not necessarily abstracted by the same node in $\mathcal{L}_\tau^{\equiv}$. This is 
the case when $\mathcal{L}_\tau$ contains two type paths identifying the same type without 
them being equivalent, as in the next example. 

Example 4. Consider the type pair(T) defined as 

pair(T) ---> (T - T). 

A term of the type pair(T) is a term (A - B) where A and B are terms of type 
T. For $\tau = pair(T)$, 

$\mathcal{L}_\tau = \mathcal{L}_\tau^{\equiv} = \{\langle\rangle, \langle((-),1)\rangle, \langle((-),2)\rangle\}$ 

Although $\langle((-),1)\rangle$ and $\langle((-),2)\rangle$ identify subterms of the same type T, they are 
not equivalent according to the definition of equivalence. 

The ability to distinguish between two occurrences of the same type in $\mathcal{L}_\tau^{\equiv}$ 
allows a characterisation of terms of type $\tau$ with a finer granularity than with 
type based analyses [51,7,26]. This is illustrated by Example 4. A type based 
analysis places the two components of a pair in the same equivalence class as 
$\langle((-),1)\rangle$ and $\langle((-),2)\rangle$ select nodes of the same type. We do not, and can therefore 
calculate different binding times for them. 
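
One way to compute a type graph is sketched below in Python. The sketch rests on an assumption about how minimal representatives are obtained: while descending through the type definition, a subtype that equals the type of one of its ancestors on the current path is collapsed onto that ancestor, and every other subtype gets a node of its own. This reproduces the graphs given above for list(T) and pair(T), but it is only an illustration with a hypothetical encoding and hypothetical names; it terminates only for types that satisfy the restriction stated earlier.

```python
# constructor -> (type parameters, alternatives as (functor, argument types))
TYPE_DEFS = {
    "list": (("T",), [("[]", []), ("[|]", ["T", ("list", "T")])]),
    "pair": (("T",), [("-", ["T", "T"])]),
}


def substitute(ty, theta):
    """Apply a type substitution to a type."""
    if isinstance(ty, str):
        return theta.get(ty, ty)
    return (ty[0],) + tuple(substitute(arg, theta) for arg in ty[1:])


def children(ty):
    """Yield ((functor, position), subtype) for every one-step extension."""
    if isinstance(ty, str):
        return
    params, alternatives = TYPE_DEFS[ty[0]]
    theta = dict(zip(params, ty[1:]))
    for functor, arg_types in alternatives:
        for position, arg in enumerate(arg_types, start=1):
            yield (functor, position), substitute(arg, theta)


def type_graph(ty):
    """Map each representative path of `ty` to the type it selects."""
    graph = {(): ty}
    worklist = [((), ty, [ty])]  # (path, selected type, types on the way down)
    while worklist:
        path, current, types_above = worklist.pop()
        for step, child in children(current):
            if child in types_above:
                continue  # collapsed onto an ancestor selecting the same type
            new_path = path + (step,)
            graph[new_path] = child
            worklist.append((new_path, child, types_above + [child]))
    return graph


list_graph = type_graph(("list", "T"))
pair_graph = type_graph(("pair", "T"))
```

For pair(T) the two argument positions remain separate nodes, since neither path is an extension of the other.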

Now, one can obtain an abstract characterisation of terms of type $\tau$, based 
on the structure of the term (or at least the type it belongs to), by associating 
an abstract value to each of the paths in $\mathcal{L}_\tau^{\equiv}$. For binding-time analysis, we are 
interested in the time a (part of a) value becomes known in the computation 
process. We use the abstract values $\mathcal{B} = \{static, dynamic\}$: static denotes that 
the binding certainly occurs at specialisation time; dynamic that it is not known 
when (and in case of logic programs “if”) the binding occurs. A binding-time 
associates a value from $\mathcal{B}$ to each of the paths in a type graph. 

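Definition 5 (binding-time). A binding-time for a type $\tau \in \mathcal{T}(\Sigma_{\mathcal{T}}, V_{\mathcal{T}})$ is a 
function 

$\beta : \mathcal{L}_\tau^{\equiv} \mapsto \mathcal{B}$ 

such that $\forall \delta \in dom(\beta)$ holds that $\beta(\delta) = dynamic$ implies that $\beta(\delta') = dynamic$ 
for all $\delta' \in dom(\beta)$ with $\delta' = \delta \cdot \gamma$ for some $\gamma \in TPath$. The set of all 
binding-times (independent of the type) is denoted by $BT$. 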

The relation between terms and the binding-times that approximate them is 
given by the following abstraction function. 

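Definition 6 (binding-time abstraction). The binding-time abstraction is 
a function $\alpha : \mathcal{T}(\Sigma, V) \mapsto BT$ and is defined as follows: 

$$\alpha(t : \tau) = \left\{ (\delta, v) \;\middle|\; \begin{array}{l}
\delta \in \mathcal{L}_\tau^{\equiv} \text{ and } v = dynamic \text{ if } \exists \delta' \text{ and a subterm } t^{\delta'} \text{ in } t \\
\quad \text{such that } t^{\delta'} \text{ is a variable and } \delta \equiv \delta' \cdot \gamma \text{ for some } \gamma \in TPath \\
v = static \text{ otherwise}
\end{array} \right\}$$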



If a term $t : \tau$ contains a subterm $t^\delta$ that is a variable, then the binding-time 
abstraction associates the value dynamic to the path in $\mathcal{L}_\tau^{\equiv}$ that identifies this 
subterm and to all its extensions in $\mathcal{L}_\tau^{\equiv}$. 

Example 5. Given the following terms of type list(T) as defined in Example 1, 
their binding-time abstraction is: 

$\alpha([]) = \{(\langle\rangle, static), (\langle([|],1)\rangle, static)\}$ 
$\alpha([X_1, X_2]) = \{(\langle\rangle, static), (\langle([|],1)\rangle, dynamic)\}$ 
$\alpha(X) = \{(\langle\rangle, dynamic), (\langle([|],1)\rangle, dynamic)\}$ 
$\alpha([X|Y]) = \{(\langle\rangle, dynamic), (\langle([|],1)\rangle, dynamic)\}$ 

Since the term [] does not contain any variable, it is abstracted by a 
binding-time specifying that the list’s skeleton as well as its elements are static. A term 
$[X_1, X_2]$ is approximated by a binding-time specifying that the list’s skeleton is 
static, but its elements are dynamic. A variable is abstracted by a binding-time 
specifying that the list’s skeleton as well as its elements are dynamic. Also a term 
$[X|Y]$ is approximated by a binding-time stating that its list skeleton as well as 
its elements are dynamic due to the presence of the variable subterm $Y : list(T)$. 

The following example shows why, if the value dynamic is associated to a path 
$\delta$ in a binding-time for a type $\tau$, dynamic is also associated to all extensions of 
$\delta$ in $\mathcal{L}_\tau^{\equiv}$. 

Example 6. Consider a type definition for a tree of integers: 

inttree ---> nil ; t(int, inttree, inttree). 

The type graph of $\tau = inttree$, $\mathcal{L}_\tau^{\equiv}$, contains only two paths: $\langle\rangle$ denoting the tree’s 
skeleton, and $\langle(t,1)\rangle$ denoting the integer elements in the tree. We have 

$\alpha(t(0, X, t(1, nil, nil))) = \{(\langle\rangle, dynamic), (\langle(t,1)\rangle, dynamic)\}$. 

Although all subterms of type int in the term t(0, X, t(1, nil, nil)) are non-variable 
terms, we cannot abstract them to static. Indeed, the variable X in the term, 
being of type inttree, possibly represents some unknown integer elements. 
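
The abstractions of Examples 5 and 6 can be reproduced with the following Python sketch. It is an illustration under stated assumptions rather than the definition itself: terms are tuples of a functor followed by its arguments, variables are instances of a hypothetical `Var` class, representative paths are computed by collapsing a subtype onto an ancestor of the same type, and the value dynamic is propagated to every path in the type graph that extends a dynamic path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


# constructor -> (type parameters, alternatives as (functor, argument types))
TYPE_DEFS = {
    "list": (("T",), [("[]", []), ("[|]", ["T", ("list", "T")])]),
    "inttree": ((), [("nil", []), ("t", ["int", ("inttree",), ("inttree",)])]),
}


def substitute(ty, theta):
    if isinstance(ty, str):
        return theta.get(ty, ty)
    return (ty[0],) + tuple(substitute(arg, theta) for arg in ty[1:])


def alternatives(ty):
    """Map each functor of `ty` to its instantiated argument types."""
    if isinstance(ty, str):
        return {}
    params, alts = TYPE_DEFS[ty[0]]
    theta = dict(zip(params, ty[1:]))
    return {f: [substitute(a, theta) for a in args] for f, args in alts}


def step(node, functor, position, child_type):
    """Representative node reached from `node` via (functor, position)."""
    path, above = node  # above: [(representative path, type)] from the root down
    for index, (seen_path, seen_type) in enumerate(above):
        if seen_type == child_type:
            return seen_path, above[: index + 1]
    new_path = path + ((functor, position),)
    return new_path, above + [(new_path, child_type)]


def type_graph(ty):
    graph, worklist = {()}, [(((), [((), ty)]), ty)]
    while worklist:
        node, current = worklist.pop()
        for functor, arg_types in alternatives(current).items():
            for position, child in enumerate(arg_types, start=1):
                reached = step(node, functor, position, child)
                if reached[0] not in graph:
                    graph.add(reached[0])
                    worklist.append((reached, child))
    return graph


def abstract(term, ty):
    """Binding-time abstraction of `term`, viewed as a term of type `ty`."""
    dynamic = set()

    def walk(subterm, current, node):
        if isinstance(subterm, Var):
            dynamic.add(node[0])
        elif isinstance(subterm, tuple):
            arg_types = alternatives(current).get(subterm[0], [])
            for position, (arg, child) in enumerate(zip(subterm[1:], arg_types), start=1):
                walk(arg, child, step(node, subterm[0], position, child))

    walk(term, ty, ((), [((), ty)]))
    # A path is dynamic if it is, or extends, a path at which a variable occurs.
    return {
        path: "dynamic" if any(path[: len(d)] == d for d in dynamic) else "static"
        for path in type_graph(ty)
    }
```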

To make our approximations suitable for a binding-time analysis, we define a 
partial order relation on $BT$: 

Definition 7 (covers). Let $\beta, \beta' \in BT$ such that $dom(\beta) \subseteq dom(\beta')$ or 
$dom(\beta') \subseteq dom(\beta)$. We say that $\beta$ covers $\beta'$, denoted by $\beta \succeq \beta'$, if and only if 
$\beta'(\delta) = dynamic \Rightarrow \beta(\delta) = dynamic$ holds for all $\delta \in dom(\beta) \cap dom(\beta')$. 

If a binding-time $\beta$ covers another binding-time $\beta'$, then $\beta$ is “at least as 
dynamic” as $\beta'$. Note that the relationship between $dom(\beta)$ and $dom(\beta')$ 
implies that the covers relation is only defined between two binding-times that are 
derived from types $\tau$ and $\tau'$ such that either $\tau$ is an instance of $\tau'$ or $\tau'$ is an 
instance of $\tau$. 

Example 7. Recall the binding-times obtained by abstracting the terms in 
Example 5. We have that 

$\alpha(X) \succeq \alpha([X_1, X_2]) \succeq \alpha([])$ 

In what follows, we extend the notion of the $\succeq$ relation to include the elements 
$\{\top, \bot\}$ such that $\top \succeq \beta$ and $\beta \succeq \bot$ for all $\beta \in BT$. If we denote with 
$BT^{+}$ the set $BT \cup \{\top, \bot\}$, then $(BT^{+}, \succeq)$ forms a complete lattice. Wherever 
appropriate, we use $\bot$ and $\top$ to denote, for a particular type, a binding-time in 
which all paths are mapped to static, respectively a binding-time in which all 
paths are mapped to dynamic. Occasionally we will also call such binding-times 
completely static and completely dynamic, respectively. 
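
The covers relation is straightforward to check when binding-times are represented as dictionaries from paths to values. The Python sketch below assumes such a representation (the encoding of paths as tuples and all names are hypothetical) and replays the chain of Example 7.

```python
def covers(beta, beta_prime):
    """Return True if `beta` covers `beta_prime`."""
    dom, dom_prime = set(beta), set(beta_prime)
    if not (dom <= dom_prime or dom_prime <= dom):
        raise ValueError("covers is only defined when one domain contains the other")
    # Wherever beta_prime is dynamic on a shared path, beta must be dynamic too.
    return all(
        beta[path] == "dynamic"
        for path in dom & dom_prime
        if beta_prime[path] == "dynamic"
    )


SKELETON, ELEMENTS = (), (("[|]", 1),)
alpha_nil = {SKELETON: "static", ELEMENTS: "static"}      # abstraction of []
alpha_x1_x2 = {SKELETON: "static", ELEMENTS: "dynamic"}   # abstraction of [X1, X2]
alpha_x = {SKELETON: "dynamic", ELEMENTS: "dynamic"}      # abstraction of X

chain_holds = covers(alpha_x, alpha_x1_x2) and covers(alpha_x1_x2, alpha_nil)
```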

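We conclude this section by introducing some more notation. First, if $\beta$ 
denotes a binding-time for a type $\tau$ and $\delta \in dom(\beta)$, then $\beta^\delta$ denotes the 
binding-time for the type $\tau^\delta$ that is obtained as follows: 

$\beta^\delta = \{ (\gamma, \beta(\overline{[\delta \cdot \gamma]})) \mid \gamma \in \mathcal{L}_{\tau^\delta}^{\equiv} \}$. 

In other words, if $\beta = \alpha(t)$ then $\beta^\delta = \alpha(t^\delta)$. Finally, let $\tau, \tau_1, \ldots, \tau_n$ be types 
and $f \in \Sigma$ such that $f(t_1 : \tau_1, \ldots, t_n : \tau_n)$ is a term in the denotation of $\tau$. If 
$\beta_1, \ldots, \beta_n$ are binding-times for the types $\tau_1, \ldots, \tau_n$, we denote with $f(\beta_1, \ldots, \beta_n)$ 
the least dynamic binding-time $\beta$ for type $\tau$ such that $\beta^{\overline{[\langle(f,i)\rangle]}} \succeq \beta_i$ for all $i$. 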

3 A Modular Binding-Time Analysis for Mercury 

In what follows, we develop a polyvariant binding-time analysis. The final output 
of the analysis is an annotated program in which each of the original procedures 
may occur in several annotated versions, depending on the binding-times of the 
(input) arguments with respect to which the procedure was called. Each such 
version contains the binding-times of the local variables and output arguments 
as well as instructions stating for each subgoal of the procedure’s body whether 
or not it should be evaluated during specialisation. Correctness of the analysis 
ensures that if a particular call $p(t_1, \ldots, t_n)$ occurs during specialisation, the 
analysis has created a version of the called procedure that is annotated with 
respect to the particular call’s binding-time abstraction $p(\alpha(t_1), \ldots, \alpha(t_n))$. Before 
we define the actual analysis, we introduce Mercury’s module system and define 
some necessary machinery to base the analysis upon. 

3.1 Mercury’s Module System 

A Mercury program is defined as a set of Mercury modules. The basic mod- 
ule system of Mercury is simple. A module consists of an interface part and 
an implementation part. The interface part contains those type definitions and 
procedure declarations that the module provides (or exports) towards other mod- 
ules. In other words, the types and procedures declared in the interface part of 
a module are visible and can be used (or imported) by other modules. Apart 
from the implementation of the procedures that are declared in the module’s 
interface, its implementation part possibly contains additional type definitions 
and the declaration and implementation of additional procedures. These types 
and procedures are only visible in the implementation part of this module, and 
cannot be used by other modules. 

Note that the way in which the modules import each other imposes a hierarchy 
on the modules that constitute a program.^ Following the terminology of 
[36], we use the notation $imports(M, M')$ to indicate that the module $M$ imports 
the interface of $M'$ and $imported(M)$ to denote the set of modules that are 
imported by $M$, that is: $imported(M) = \{M' \mid imports(M, M')\}$. Figure 1 shows 
an example of a module hierarchy in Mercury in which we graphically represent 
a module by a box, and denote $imports(M, M')$ by an arrow from $M$ towards 
$M'$. In the example, we have that $imported(M_1) = \{M_2, M_3\}$. Note that in 
Mercury, the imports relation is not transitive; when a module $M$ imports the 
interface of a module $M'$, it becomes dependent on the interfaces imported by 
$M'$ (and those imported therein) but it does not import these itself. The 
module system described above is to some extent a simplification of Mercury’s real 
module system, in which modules can be constructed from submodules. While 
submodules do provide extra means to the programmer to control encapsulation 
and visibility of declarations, they do not pose additional conceptual difficulties 
and we do not consider them in the remainder of this work. 

Fig. 1. A sample module hierarchy. 

In this work, we aim at developing a binding-time analysis that is as modular 
as possible. Ultimately, a modular analysis deals with each module of a program 
in isolation. We will discuss throughout the text to what extent our binding-time 
analysis is modular in this respect. 
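
The difference between importing and depending on a module, and the bottom-up order in which a modular analysis can visit an acyclic module hierarchy, can be sketched in Python as follows. The hierarchy used here is hypothetical (it is not claimed to be the one of Fig. 1), as are the function names; the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical hierarchy: each module is mapped to the modules it imports.
IMPORTS = {"M1": {"M2", "M3"}, "M2": {"M4"}, "M3": {"M4"}, "M4": set()}


def imported(module):
    """Modules whose interface is imported directly by `module`."""
    return set(IMPORTS[module])


def depends_on(module):
    """Modules that `module` depends on, directly or indirectly."""
    seen, stack = set(), [module]
    while stack:
        for successor in IMPORTS[stack.pop()]:
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def bottom_up_order():
    """An order in which every module comes after the modules it imports."""
    return list(TopologicalSorter(IMPORTS).static_order())


order = bottom_up_order()
```

In this hypothetical hierarchy, M1 depends on M4 without importing it, which mirrors the fact that the imports relation is not transitive.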

3.2 Mercury Programs for Analysis 

Mercury is an expressive language, in which programs can be composed of 
predicates and functions, one can use DCG notation, etc. However, if we consider 
only programs that are type correct and well-moded (which is natural, since the 
compiler should reject programs that are not [43]), such a program can be 
translated into superhomogeneous form [43]. Translation to superhomogeneous form 
involves a number of analysis and transformation steps. These include 
translating an n-ary function definition into an (n+1)-ary predicate definition [44], 
making the implicit arguments in DCG-predicate definitions and calls explicit, 
and copying and renaming predicate definitions and calls such that every 
remaining predicate definition has a single mode declaration associated with it 
[43] that specifies for each argument whether it is an input or output argument. 
As such, every predicate definition is transformed to a set of so-called procedure 
definitions, with one procedure for every mode in which the original predicate is 
used. 

^ While in Mercury modules may depend on each other in a circular way, we restrict 
our attention to programs in which no circular dependencies exist between the 
modules. We discuss how one could deal with circular dependencies in Section 6. 

For our analysis purposes, we assume that a Mercury program is given in 
superhomogeneous form. This does not involve any loss of generality, as the 
transformation from a plain Mercury program into superhomogeneous form is 
completely defined and automated [43]. Formally, the syntax of Mercury 
programs in superhomogeneous form can be defined as follows. We use the symbol 
$\Pi$ to refer to the set of procedure symbols underlying the language associated 
to the program. As such, we consider two procedures that are derived from the 
same predicate as having different procedure symbols. 

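Definition 8 (superhomogeneous form). 

$$\begin{array}{lcl}
Proc & ::= & p(\overline{X}) \; \text{:-} \; G. \\
Goal & ::= & Atom \mid not(G) \mid (G_1, G_2) \mid (G_1 ; G_2) \mid \text{if } G_1 \text{ then } G_2 \text{ else } G_3 \\
Atom & ::= & X := Y \mid X == Y \mid X \Rightarrow f(\overline{Y}) \mid X \Leftarrow f(\overline{Y}) \mid p(\overline{X})
\end{array}$$

where $p/n \in \Pi$, $X$ and $Y$ are distinct variables and $\overline{X}$ is a sequence of $n$ 
distinct variables of $V$, $f/m \in \Sigma$, $\overline{Y}$ a sequence of $m$ distinct variables of $V$, and 
$G, G_1, G_2, G_3 \in Goal$. 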

The definition of a procedure p in superhomogeneous form consists of a single 
clause. The arguments in the head of the clause, denoted by Args(p), 
are distinct variables; explicit unifications are created for these variables in the 
body goal - denoted by Body(p) - and complex unifications are broken down 
into several simpler ones. The arguments of a procedure p are divided into a set of 
input arguments, denoted by in(p), and a set of output arguments, denoted by 
out(p). A goal is either an atom or a number of goals connected by conjunction, 
disjunction, if-then-else or not. An atom is either a unification or a procedure 
call. Note that, as an effect of mode analysis [43], unifications are categorised as 
follows: 

- An assignment of the form $X := Y$. For such a unification, $Y$ is input, 
  whereas $X$ is output. 

- A test of the form $X == Y$. Both $X$ and $Y$ are input to the unification and 
  of atomic type. 

- A deconstruction of the form $X \Rightarrow f(\overline{Y})$. In this case, $X$ is input to the 
  unification whereas $\overline{Y}$ is a sequence of output variables. 

- A construction of the form $X \Leftarrow f(\overline{Y})$. In this case, $X$ is output from the 
  unification whereas $\overline{Y}$ is a sequence of input variables. 

During the translation into superhomogeneous form, unifications between values 
of a complex data type may be transformed into a call to a newly generated 
procedure that (possibly recursively) performs the unification. For any goal G, we 
denote with in(G) and out(G) the set of its input, respectively output variables. 

Example 8. Consider the classical definition of the append/3 predicate, both in 
normal syntax and in superhomogeneous form for the mode append(in, in, out), 
as depicted in Fig. 2. 

append/3: 

    append([], Y, Y).
    append([E|Es], Y, [E|R]) :-
        append(Es, Y, R).

append/3 in superhomogeneous form: 

    append(X, Y, Z) :-
        ( X => [], Z := Y
        ; X => [E|Es], append(Es, Y, R), Z <= [E|R] ).

Fig. 2. The append/3 predicate and append(in, in, out) in superhomogeneous 
form. 
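
The grammar of Definition 8 can be mirrored by a small abstract syntax tree. The following Python sketch is illustrative only: the class names (`Unify`, `Call`, `Conj`, `Disj`) and the helper `out_vars` are hypothetical and are not taken from the Mercury compiler. Negation and if-then-else are left out for brevity, and taking the union over both branches of a disjunction is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Unify:
    kind: str                    # ":=", "==", "=>" (deconstruction) or "<=" (construction)
    lhs: str                     # the variable X
    functor: str = ""            # f, unused for ":=" and "=="
    args: Tuple[str, ...] = ()   # the sequence Y-bar; a single variable for ":=" and "=="


@dataclass(frozen=True)
class Call:
    name: str
    args: Tuple[str, ...]
    modes: Tuple[str, ...]       # "in" or "out" for each argument (hypothetical encoding)


@dataclass(frozen=True)
class Conj:
    left: "Goal"
    right: "Goal"


@dataclass(frozen=True)
class Disj:
    left: "Goal"
    right: "Goal"


Goal = Union[Unify, Call, Conj, Disj]


def out_vars(goal):
    """Output variables of a goal, following the categorisation of unifications."""
    if isinstance(goal, Unify):
        if goal.kind in (":=", "<="):   # assignment and construction bind X
            return {goal.lhs}
        if goal.kind == "=>":           # deconstruction binds the variables in Y-bar
            return set(goal.args)
        return set()                    # a test has no output variables
    if isinstance(goal, Call):
        return {a for a, m in zip(goal.args, goal.modes) if m == "out"}
    return out_vars(goal.left) | out_vars(goal.right)


# Body of append(in, in, out) in superhomogeneous form (Fig. 2).
APPEND_BODY = Disj(
    Conj(Unify("=>", "X", "[]"), Unify(":=", "Z", args=("Y",))),
    Conj(Unify("=>", "X", "[|]", ("E", "Es")),
         Conj(Call("append", ("Es", "Y", "R"), ("in", "in", "out")),
              Unify("<=", "Z", "[|]", ("E", "R")))),
)
```

With this encoding, `out_vars(APPEND_BODY)` collects every variable that is bound somewhere in the body, which is the information a mode analysis would make available.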



According to Definition 8, conjunctions and disjunctions are considered binary 
constructs. This differs from their representation inside the Melbourne compiler 
[40], where conjunctions and disjunctions are represented in flattened form. Our 
syntactic definition however facilitates the conceptual handling of these con- 
structs during analysis. 

For analysis purposes, we assume that every subgoal of a procedure body 
is identified by a unique program point; the set of all such program points is 
denoted by $\mathcal{P}p$. If we are dealing with a particular procedure, we denote with $\eta_0$ 
the program point associated with the procedure's head atom, and with $\eta_b$ the 
program point associated with its body goal. The set of program points identifying 
the subgoals of a goal G is denoted by $\mathcal{P}ps(G)$; this set includes the program 
point identifying G itself. If the particular program point identifying a goal G in 
a procedure's body is important, we subscript the goal with its program point, as 
in $G_\eta$, or explicitly state that $\mathcal{P}p(G) = \eta$. An important use of program points is 
to identify those atoms in the body of a procedure in which a particular variable 
becomes initialised or, put otherwise, those atoms of which the variable is an 
output variable. This information is computed by mode analysis, and we assume 
the availability of a function 



$$\mathit{init} : \mathcal{V} \to \wp(\mathcal{P}p)$$

with the intended meaning that, for a variable V used in some procedure, if 
$\mathit{init}(V) = \{\eta_1, \ldots, \eta_n\}$, the variable V is an output variable of the atoms 
identified by $\eta_1, \ldots, \eta_n$. Note that the function init is implicitly associated with 
a particular procedure, which we do not mention explicitly. When we use the 
function init, it will be clear from the context to what particular procedure it 
is associated. 

Although Mercury has some support for more involved modes - other than input 
versus output - that are necessary to support partially instantiated data structures 
at run-time, release 0.9 of the Mercury implementation [40] does not fully support 
these. 

Example 9. Let us recall the definition of append/3 in superhomogeneous form 
for the mode append(in, in, out), with the atoms and structured goals occurring 
in the procedure's definition explicitly identified by subscripting them with 
their respective program point as in Figure 3. We denote the program points 
associated to a structured goal by subscripting the goal with the characters 'c' 
for conjunction and 'd' for disjunction, accompanied by a natural number. 

    append(X,Y,Z)_0 :-
        ( ( X => []_1, Z := Y_2 )_c1 ;
          ( X => [E|Es]_3, ( append(Es,Y,R)_4, Z <= [E|R]_5 )_c2 )_c3 )_d1.

Fig. 3. append/3 with explicit program points. 

From mode analysis, it follows that 

$$
\begin{array}{lll}
\mathit{init}(X) = \{0\} & \mathit{init}(E) = \{3\} & \mathit{init}(R) = \{4\} \\
\mathit{init}(Y) = \{0\} & \mathit{init}(Es) = \{3\} & \mathit{init}(Z) = \{2, 5\}
\end{array}
$$

Or, put otherwise, X and Y (being input arguments) are initialised in the 
procedure's head, E and Es are initialised in the deconstruction identified by program 
point 3, R is initialised in the recursive call, whereas Z is initialised either by 
the assignment Z := Y (program point 2) or by the construction Z <= [E|R] 
(program point 5). 
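
The init function of Example 9 can be obtained by inverting a table that lists, for each program point, the variables that are output of the atom at that point. The sketch below assumes such a table is already available from mode analysis; the names `OUTPUTS` and `compute_init` are hypothetical.

```python
from collections import defaultdict

# Program point -> variables that are output of the atom at that point,
# transcribed by hand from Fig. 3 (point 0 is the procedure's head atom).
OUTPUTS = {
    0: {"X", "Y"},    # head: the input arguments are initialised here
    1: set(),         # X => []
    2: {"Z"},         # Z := Y
    3: {"E", "Es"},   # X => [E|Es]
    4: {"R"},         # append(Es, Y, R)
    5: {"Z"},         # Z <= [E|R]
}


def compute_init(outputs):
    """Map each variable to the set of program points at which it is initialised."""
    init = defaultdict(set)
    for point, variables in outputs.items():
        for variable in variables:
            init[variable].add(point)
    return dict(init)


INIT = compute_init(OUTPUTS)
```

A variable such as Z that is bound in both branches of the disjunction ends up with two initialisation points, which is why init returns a set of program points rather than a single one.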



3.3 A Modular Analysis 

In order to make the binding-time analysis as modular as possible, we devise 
an analysis that works in two phases. In a first phase, we represent binding-times 
and the relations that exist between them according to the data flow in the 
program in a symbolic way. Doing so enables us to perform a large part of the 
data-flow analysis independent of a particular call pattern. It is only in the second 
phase that call patterns, in the form of the binding-times of a procedure's input 
arguments, are used - in combination with the symbolic information derived 
from the first phase - for computing the annotations and the actual binding-times 
of the procedure's other variables. The first phase of the analysis hence 
is call independent whereas the second phase is call dependent. Obviously, the 
call independent phase of the analysis does not need to be repeated in case a 
procedure is called with a different binding-time characterisation of its arguments. 
Consequently, the result of a module's call independent analysis can be used 
regardless of the context the module is used in, and need not be recomputed when 
the module is used in different programs. Since the domain of binding-times is 
condensing [21], the call-independent analysis preserves the precision that would 
be obtained by a call-dependent analysis. 

To symbolically represent the binding-time of a variable at a particular program 
point, we introduce the concept of a binding-time variable, the set of which 
is denoted by $\mathcal{V}_{BT}$. We will denote elements of this set as variables subscripted 
by a program point. If V is a variable occurring in a goal G, and $\eta$ is a program 
point identifying an atom in G, then the binding-time variable $V_\eta \in \mathcal{V}_{BT}$ 
symbolically represents the binding-time of V at program point $\eta$. Given a type 
path $\delta \in \mathit{TPath}$, we use the notation $V_\eta^\delta$ to denote the subvalue identified by $\delta$ 
in the binding-time of V at program point $\eta$. 

Example 10. Given the definition of append/3 from Example 9, the binding-time 
variables $X_0$, $Z_2$, $Z_5$ and $Z_0$ denote, respectively, the binding-time of X at the 
program point 0 and the binding-times of Z at the program points 2, 5 and 0. 

Apart from the binding-time variables that correspond with program variables, 
we introduce a number of extra binding-time variables that we use to 
symbolically represent some control information that will be collected (and needed) 
during the binding-time analysis. For each program point $\eta$, we introduce two 
such variables, $\mathcal{R}_\eta$ and $\mathcal{C}_\eta$, that range over the set of binding-times $\{\bot, \top\}$. Their 
intended meaning is as follows: 

- $\mathcal{R}_\eta = \bot$: Either the goal identified by $\eta$ reduces to true or fail during 
  specialisation, or its residual code is guaranteed not to fail at run-time. 

- $\mathcal{R}_\eta = \top$: No claims are made about the outcome of the reduction at 
  specialisation time. 

- $\mathcal{C}_\eta = \top$: The goal identified by $\eta$ is under dynamic control in the procedure's 
  body. We say that an atom is under dynamic control if whether it 
  will be evaluated depends on the success or failure of another goal, say $G_{\eta'}$, 
  while success or failure of that goal is undecided at specialisation-time (that 
  is, $\mathcal{R}_{\eta'} = \top$). 

- $\mathcal{C}_\eta = \bot$: The goal identified by $\eta$ is not under dynamic control in the 
  procedure's body. 

Note that these binding-time variables - which we will refer to as control 
variables - are boolean in the sense that they will only assume a value that is 
either $\bot$ or $\top$. During the binding-time analysis, these control variables collect 
the necessary information to implement the control strategy of the specialiser. 
Our analysis models a rather conservative specialisation strategy, in the sense 
that during specialisation, no atoms are reduced that are under dynamic control. 

Hence $V_\eta^{\langle\rangle}$ and $V_\eta$ denote the same binding-time value and we will use the latter in 
examples. 

The idea behind this strategy is that in this way only atoms are reduced that 
would also be evaluated if the program is executed with a complete input that 
extends the static input for which the program is specialised. 

Indeed, their being evaluated depends only on goals that are - during specialisation - 
sufficiently reduced in order to decide success or failure. Hence, no atoms 
are "speculatively" reduced, guaranteeing termination of the reduction process 
(constituting local termination) under the assumption that the equivalent single 
stage computation terminates. 

Example 11. Consider the following code fragment 

    if X => [] then p(X) else q(X)

Both atoms p(X) and q(X) are under dynamic control if X's binding-time 
does not allow the specialiser to decide whether or not the test X => [] will 
succeed during specialisation. Indeed, the specialiser has no means of knowing 
which of the branches will be taken during the second stage of the computation.® 

In general, the binding-time of a program variable can depend on the binding- 
times of other program variables (according to the data flow) and on the value 
of the appropriate control variables (according to the control strategy). The 
values of the control variables that are associated to a goal in turn depend on 
the binding-times of that goal’s input variables. Symbolically, we can represent 
these dependencies by a number of constraints between the involved binding- 
time variables. In general: 

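Definition 9 (binding-time constraint). A binding-time constraint is a 
constraint of one of the following forms: 

$$
V_\eta^\delta \succeq X_{\eta'}^\gamma \qquad
V_\eta^\delta \succeq \top \qquad
V_\eta^\delta \succeq^* X_{\eta'}^\gamma \qquad
V_\eta^\delta \succeq^* \top
$$

where $V_\eta, X_{\eta'} \in \mathcal{V}_{BT}$ and $\delta, \gamma \in \mathit{TPath}$. The set of all binding-time constraints 
is denoted by $\mathit{BTC}$. 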



A constraint of the form $V_\eta^\delta \succeq X_{\eta'}^\gamma$ denotes that the binding-time represented 
by $V_\eta^\delta$ must be at least as dynamic as (or cover) the binding-time represented 
by $X_{\eta'}^\gamma$. Note that such a constraint requires the types of V and X, denoted by 
$\tau_V$ and $\tau_X$, to be such that $\tau_V^\delta$ and $\tau_X^\gamma$ are instances of one another, in order 
for their binding-times to be comparable. The intended meaning of a constraint 
of the form $V_\eta^\delta \succeq^* X_{\eta'}^\gamma$ is that the binding-time represented by $V_\eta^\delta$ is at least 
as dynamic as the binding-time value associated to the path identified by $\gamma$ 
in the binding-time represented by $X_{\eta'}$. Note that such a constraint does not 
require $\tau_V$ and $\tau_X$ to be of comparable types; it simply expresses that if the node 
identified by $\gamma$ in the binding-time represented by $X_{\eta'}$ is dynamic, so must be 
the node identified by $\delta$ in $V_\eta$; and by definition of a binding-time, so must be all 
its descendant nodes. Remark that we also allow constraints in which the 
right-hand side is the constant $\top$. Although we occasionally also consider constraints 
of which the right-hand side is the constant $\bot$, we do not explicitly mention 
these in the definition, as these constraints are superfluous: for any $X_\eta \in \mathcal{V}_{BT}$ 
and $\delta \in \mathit{TPath}$, it holds by definition that $X_\eta^\delta \succeq \bot$. 

® Note that it can happen that the analysis cannot predict the outcome of the test while 
execution of the program with full input always selects the same branch, e.g. q(X). 
Although the call to p(X) is residualised, the code of the procedure p/1 is specialised. 
All reductions performed while specialising p/1 are then in fact speculative (and 
the specialisation could in extreme cases be non-terminating while execution of the 
program to be specialised with full input is always terminating). 

Example 12. Reconsider the definition of append/3 in Fig. 3. Some examples 
of binding-time constraints between binding-time variables from append/3 and 
their intended meaning are: 

- $Z_2 \succeq Y_0$: The binding-time associated to Z at program point 2 
  is at least as dynamic as the binding-time associated 
  to Y at program point 0. 

- $E_3 \succeq X_0^{([|],1)}$: The binding-time associated to E at program point 3 
  is at least as dynamic as the subvalue denoted by ([|], 1) 
  of the binding-time associated to X at program point 0. 

- $Z_5^{([|],1)} \succeq E_3$: The subvalue denoted by ([|], 1) in the binding-time of 
  Z at program point 5 is at least as dynamic as the 
  binding-time associated to E at program point 3. 

- $\mathcal{R}_3 \succeq^* X_0$: If $X_0$ represents a binding-time in which the 
  root node $\langle\rangle$ is bound to dynamic, then 
  one cannot assume that the atom at program point 3 reduces to 
  true, fail or code that is guaranteed to succeed. 

- $\mathcal{C}_4 \succeq \mathcal{R}_3$: The atom at program point 4 must be under dynamic control 
  if the specialisation of the atom at program point 3 possibly 
  results in residual code that might fail at run-time. 



A set of binding-time constraints is called a binding-time constraint system 
(or simply a constraint system). Given a constraint system $C$, we define $\mathit{vars}(C)$ 
as the set of all binding-time variables that occur in some constraint $c \in C$. 
The link between a binding-time constraint system and the actual binding-times 
it represents is formalised as a (minimal) solution to the constraint system. 

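Definition 10 (solution). A solution to a binding-time constraint system $C$ is 
a substitution $\sigma : \mathcal{V}_{BT} \to \mathit{BT}$ mapping binding-time variables to binding-times 
with $\mathit{dom}(\sigma) = \mathit{vars}(C)$ such that 

- for every constraint $V_\eta^\delta \succeq \top \in C$ and $V_\eta^\delta \succeq^* \top \in C$ it holds that $\sigma(V_\eta)^\delta \succeq \top$ 

- for every constraint $V_\eta^\delta \succeq X_{\eta'}^\gamma \in C$ it holds that $\sigma(V_\eta)^\delta \succeq \sigma(X_{\eta'})^\gamma$ 

- for every constraint $V_\eta^\delta \succeq^* X_{\eta'}^\gamma \in C$ it holds that $\sigma(X_{\eta'})(\gamma) = \mathit{dynamic}$ 
  implies $\sigma(V_\eta)^\delta \succeq \top$ 

Given two solutions $\sigma$ and $\sigma'$ to $C$, we define that $\sigma \sqsupseteq \sigma'$ if for all $V_\eta \in \mathit{dom}(\sigma')$ 
it holds that $V_\eta \in \mathit{dom}(\sigma)$ and $\sigma(V_\eta) \succeq \sigma'(V_\eta)$. A solution $\sigma$ is a least solution 
for $C$ if for every solution $\sigma'$ for $C$ it holds that $\sigma' \sqsupseteq \sigma$. 

Remember, a solution must also satisfy the condition of Definition 5, i.e. if 
$\sigma(X_\eta)(\delta) = \mathit{dynamic}$ then also $\sigma(X_\eta)(\delta.\alpha) = \mathit{dynamic}$ for any extension $\alpha$. We will 
sometimes use a constraint of the form $V_\eta^\delta \succeq X_{\eta'}^\gamma \sqcup Y_{\eta''}^{\gamma'}$ (analogously for $\succeq^*$) 
as shorthand notation for the set of constraints $\{V_\eta^\delta \succeq X_{\eta'}^\gamma,\ V_\eta^\delta \succeq Y_{\eta''}^{\gamma'}\}$. Indeed, 
from Definition 10 it can be seen that in any solution $\sigma$ satisfying the latter two 
constraints, it holds that $\sigma(V_\eta)^\delta \succeq \sigma(X_{\eta'})^\gamma \sqcup \sigma(Y_{\eta''})^{\gamma'}$, where $\sqcup$ denotes the least 
upper bound on binding-times. 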

Example 13. Consider the following binding-time constraint system and its least 
solution. For the sake of simplicity, we assume that all binding-time variables are 
boolean and range over the set $\{\mathit{dynamic}, \mathit{static}\}$. 

[Table with two columns, "Binding-time constraint system" and "Least solution". 
The individual entries are not legible in the source; the least solution maps two 
of the four binding-time variables to dynamic and the other two to static.] 
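
For boolean binding-times, a least solution in the sense of Definition 10 can be computed by a simple fixed point iteration that starts from static everywhere and only raises values when a constraint forces it. The sketch below is a minimal illustration under that boolean assumption; the constraint system in it is hypothetical (it is not the system of Example 13), and the names `TOP` and `least_solution` are invented for the example.

```python
TOP = "TOP"   # the constant right-hand side, i.e. "dynamic"


def least_solution(constraints):
    """Least solution of boolean constraints (lhs, rhs), read as lhs >= rhs.

    rhs is either a variable name or TOP. False encodes static, True dynamic.
    """
    variables = {v for constraint in constraints for v in constraint if v != TOP}
    sigma = {v: False for v in variables}        # start from the bottom element
    changed = True
    while changed:                               # iterate until nothing changes
        changed = False
        for lhs, rhs in constraints:
            rhs_value = True if rhs == TOP else sigma[rhs]
            if rhs_value and not sigma[lhs]:     # constraint violated: raise lhs
                sigma[lhs] = True
                changed = True
    return sigma


# A hypothetical system: A1 >= TOP, B3 >= B2, C4 >= A1.
SYSTEM = [("A1", TOP), ("B3", "B2"), ("C4", "A1")]
SOLUTION = least_solution(SYSTEM)
```

Here `A1` is forced to dynamic, which propagates to `C4`, while `B2` and `B3` stay static because nothing forces them upwards. Termination follows because each variable can change at most once.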





In what follows, we formulate our analysis as a call-independent abstract 
semantics. We define the abstract "meaning" of a goal, be it an atom or a 
structured goal, as a set of binding-time constraints (description domain $\wp(\mathit{BTC})$) 
that reflect the data flow between the input and output arguments of the goal. 
An essential operator for the symbolic data flow analysis is a projection operator 
that basically rewrites a set of constraints such that every constraint expresses 
(or constrains) the binding-time of a local variable within a procedure as a function 
of the binding-time(s) of that procedure's input arguments. Such a constraint is 
said to be in normal form: 

Definition 11 (normal form). A binding-time constraint is in normal form 
with respect to a procedure $p \in \mathit{Proc}$ if it is either of the form 

- $V_\eta^\delta \succeq \top$ 

- $V_\eta^\delta \succeq X_{\eta_0}^\gamma$ with $X \in \mathit{in}(p)$ and $\eta_0$ the program point associated to p's head 
  atom, 

and analogously for constraints of this form using $\succeq^*$. 

Example 14. Reconsider the binding-time constraints from Example 12. The 
constraints 

$$Z_2 \succeq Y_0 \qquad E_3 \succeq X_0^{([|],1)} \qquad \mathcal{R}_3 \succeq^* X_0$$

are in normal form with respect to append/3, whereas the constraints 

$$Z_5^{([|],1)} \succeq E_3 \qquad \mathcal{C}_4 \succeq \mathcal{R}_3$$

are not. 

Projection of a constraint involves unfolding the (subvalue of the) binding-time 
variable in its right-hand side with respect to a single constraint on (a 
subvalue of) this variable. If we consider two subvalues of a binding-time variable, 
say $X_\eta^\delta$ and $X_\eta^\gamma$, one of them is a subvalue of the other if either $\delta$ is an extension 
of $\gamma$ or vice versa. This is captured by the following definition: 

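Definition 12 (extension). We define $\mathit{ext} : \mathit{TPath} \times \mathit{TPath} \to \mathit{TPath} \times \mathit{TPath}$ 
as follows: 

$$
\mathit{ext}(\gamma, \delta) =
\begin{cases}
(\langle\rangle, \alpha) & \text{if } \gamma = \delta.\alpha \\
(\alpha, \langle\rangle) & \text{if } \gamma.\alpha = \delta \\
\text{undefined} & \text{otherwise}
\end{cases}
$$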
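
Definition 12 translates directly into a prefix test on paths. In the following sketch a type path is encoded as a Python tuple of steps, which is an assumption made for illustration (the paper does not prescribe a representation), and the undefined case is modelled by returning `None`.

```python
def ext(gamma, delta):
    """Return (alpha, alpha') such that gamma.alpha == delta.alpha', or None.

    Paths are tuples of steps; the empty tuple () plays the role of the empty path.
    """
    if gamma[:len(delta)] == delta:          # gamma = delta.alpha
        return ((), gamma[len(delta):])
    if delta[:len(gamma)] == gamma:          # gamma.alpha = delta
        return (delta[len(gamma):], ())
    return None                              # neither path extends the other


# Steps are written as (functor, argument position) pairs, e.g. the tail of a list cell.
TAIL = ("[|]", 2)
HEAD = ("[|]", 1)
```

For instance, `ext((TAIL, HEAD), (TAIL,))` yields `((), (HEAD,))`: the first path extends the second by one step. When both paths are equal, both cases of the definition apply and agree on the result `((), ())`.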

Note that if $\mathit{ext}(\gamma, \delta) = (\alpha, \alpha')$ then $\gamma.\alpha = \delta.\alpha'$. Unfolding a constraint $X_\eta^\delta \succeq Y_{\eta'}^\gamma$ 
with respect to another constraint results in a new constraint on (a subvalue of) 
$X_\eta$, with as right-hand side the appropriate subvalue of the right-hand side of 
the constraint that was used for unfolding. To denote a subvalue of a constraint's 
right-hand side $\phi$ (which is either a binding-time variable or one of the constants 
$\top$ or $\bot$), we use the notation $\phi^\alpha$. If $\phi$ denotes a variable $X_{\eta'}^\gamma$, then $\phi^\alpha$ equals 
$X_{\eta'}^{[\gamma.\alpha]}$. Otherwise, if $\phi$ denotes one of the constants $\top$ or $\bot$, $\phi^\alpha$ simply equals 
$\phi$. Note the use of the least element of the equivalence class, $[\gamma.\alpha]$, to denote an 
element of the appropriate type graph (rather than the type tree). The 
projection operation is defined in Definition 13 and basically consists of a fixed 
point iteration over an unfolding operator followed by a selection operation that 
retrieves the constraints of interest from the fixed point. Recall that $\eta_0$ identifies 
the head atom of the procedure of interest. 

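Definition 13 (projection). The projection of a set $S \in \wp(\mathit{BTC})$ on a set of 
binding-time variables $V \subseteq \mathcal{V}_{BT}$ is denoted by $\mathit{proj}_V(S)$ and defined as 

$$\mathit{proj}_V(S) = \{ X_\eta^\delta \succeq^{(*)} \phi \in \mathit{lfp}(\mathit{unf}_S) \mid X_\eta \in V \}$$

where $\mathit{unf}_S$ is defined in Figure 4. 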

The symbolic analysis is defined in Definition 14. The result of analysing a 
program is a mapping (from the semantic domain Den) that maps a procedure 
symbol p to a set of binding-time constraints on the variables that occur in the 
definition of the procedure p. The constraints are in normal form. Polyvariance is 
immediate, since all constraints are expressed in terms of the procedure's input 
arguments, which are represented symbolically and hence can be instantiated 
by any call pattern. The analysis is defined by a number of semantic functions 
defining the abstract semantics of a program $\mathbf{P} : \mathit{Prog} \to \mathit{Den}$ in terms of the 
semantics of the individual procedures, goals and atoms. 

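Definition 14 (call independent abstract semantics). The call independent 
abstract semantics for description domain $\wp(\mathit{BTC})$ has semantic domain 

$$\mathit{Den} : \Pi \to \wp(\mathit{BTC})$$

and semantic functions 

$$
\begin{array}{lcl}
\mathbf{P} & : & \mathit{Prog} \to \mathit{Den} \\
\mathbf{C} & : & \mathit{Proc} \to \mathit{Den} \to \mathit{Den} \\
\mathbf{G} & : & \mathit{Goal} \to \mathit{Den} \to \wp(\mathit{BTC}) \\
\mathbf{A} & : & \mathit{Atom} \to \mathit{Den} \to \wp(\mathit{BTC})
\end{array}
$$

and is defined in Figures 5 and 6. 

$$
\begin{array}{l}
\mathit{unf}_S : \wp(\mathit{BTC}) \to \wp(\mathit{BTC}) \\[4pt]
\mathit{unf}_S(I) = \left\{ c \;\middle|\; \begin{array}{l} c \in S \cup S_1 \cup S_2 \cup S_3 \text{ and the form of } c \text{ is} \\ \text{either } X_\eta^\delta \succeq^{(*)} Y_{\eta_0}^\gamma \text{ or } X_\eta^\delta \succeq^{(*)} \top \end{array} \right\} \\[8pt]
\text{where} \\[4pt]
S_1 = \{ X_\eta^{\gamma.\alpha} \succeq \phi^{\alpha'} \mid X_\eta^\gamma \succeq Y_{\eta'}^\delta \in S,\ Y_{\eta'}^{\delta'} \succeq \phi \in I, \text{ and } \mathit{ext}(\delta, \delta') = (\alpha, \alpha') \} \\[4pt]
S_2 = \{ X_\eta^\gamma \succeq^* \phi \mid X_\eta^\gamma \succeq Y_{\eta'}^\delta \in S \text{ and } Y_{\eta'}^\delta \succeq^* \phi \in I \} \\[4pt]
S_3 = \{ X_\eta^\gamma \succeq^* \phi^\alpha \mid X_\eta^\gamma \succeq^* Y_{\eta'}^\delta \in S,\ Y_{\eta'}^{\delta'} \succeq \phi \in I \text{ and } \mathit{ext}(\delta, \delta') = (\langle\rangle, \alpha) \}
\end{array}
$$

Fig. 4. The projection $\mathit{proj}_V$ 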

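The result of analysing a program is a denotation, $\mathbf{P}[\![P]\!]$, in the domain Den, 
which is a mapping from a predicate symbol to a set of binding-time constraints. 
This mapping is defined as the least fixed point of applying the analysis function 
$\mathbf{C}$ to each individual procedure. The analysis function $\mathbf{C}$ constructs a partial 
denotation for a particular procedure, given a (possibly incomplete) denotation 
that represents the result of analysis of the whole program so far. The analysis 
functions $\mathbf{G}$ and $\mathbf{A}$ map respectively a structured goal and an atomic goal to 
a set of binding-time constraints, given a denotation - again representing the 
result of analysing the whole program so far. In general, the result of analysing a 
complex goal is the union of the constraints obtained by analysing each subgoal 
in isolation, together with a number of additional constraints on the control 
variables associated with the goal and its subgoals. These constraints are simple, as 
they merely reflect the propagation of the control variable's value, either from 
the goal to its subgoals (in case of the control variable $\mathcal{C}$) or from the goal's 
subgoals to the goal itself (in case of $\mathcal{R}$). The binding-time variables denoting 
dynamic control denote that a goal is under dynamic control with respect to 
the procedure's body. The negated goal (G) in a negation is under dynamic 
control only if the negation (not(G)) itself is. Observe that if A reduces to true or is 
guaranteed to succeed, then not(A) fails. And if A fails then not(A) succeeds. 
So we can say that the negation reduces to true, fail, or residual code which is 
guaranteed to succeed if the negated goal does. The propagation in the other 
constructs is similar: the subgoals of an if-then-else are under dynamic control if 
the if-then-else is under dynamic control. Moreover, both the then and else goals 
are under dynamic control if the test goal possibly reduces to residual code which 
could fail at run time. If each of the if-then-else's subgoals reduces to true, fail 
or code that is guaranteed to succeed, so does the if-then-else. The subgoals of a 
conjunction are under dynamic control if the conjunction itself is. Moreover, the 
second conjunct is under dynamic control if the first conjunct possibly reduces 
to residual code that could fail. If both conjuncts reduce to true, fail or code that 
is guaranteed to succeed, so does the conjunction. To conclude, if a disjunction 
is under dynamic control, so are both disjuncts. If both disjuncts reduce to true, 
fail or code that is guaranteed to succeed, so does the disjunction. 

$$
\begin{array}{lcl}
\mathbf{P}[\![P]\!] & = & \mathit{lfp}\Big(\bigsqcup_{p \in \mathit{Proc}(P)} \mathbf{C}[\![p]\!]\Big) \\[4pt]
\mathbf{C}[\![p(\overline{X}) \texttt{ :- } G_{\eta_b}]\!]\, d & = & \{(p, \mathbf{G}[\![G_{\eta_b}]\!]\, d)\} \\[4pt]
\mathbf{G}[\![(G'_{\eta'}, G''_{\eta''})_\eta]\!]\, d & = & \mathbf{G}[\![G'_{\eta'}]\!]\, d \cup \mathbf{G}[\![G''_{\eta''}]\!]\, d \cup \mathit{GG}_{conj}(\eta, \eta', \eta'') \\
\mathbf{G}[\![\mathit{not}_\eta(G'_{\eta'})]\!]\, d & = & \mathbf{G}[\![G'_{\eta'}]\!]\, d \cup \mathit{GG}_{not}(\eta, \eta') \\
\mathbf{G}[\![(\mathit{if}\ G'_{\eta'}\ \mathit{then}\ G''_{\eta''}\ \mathit{else}\ G'''_{\eta'''})_\eta]\!]\, d & = & \mathbf{G}[\![G'_{\eta'}]\!]\, d \cup \mathbf{G}[\![G''_{\eta''}]\!]\, d \cup \mathbf{G}[\![G'''_{\eta'''}]\!]\, d \\
 & & \cup\ \mathit{GG}_{if}(\eta, \eta', \eta'', \eta''') \\
\mathbf{G}[\![(G'_{\eta'} ; G''_{\eta''})_\eta]\!]\, d & = & \mathbf{G}[\![G'_{\eta'}]\!]\, d \cup \mathbf{G}[\![G''_{\eta''}]\!]\, d \cup \mathit{GG}_{disj}(\eta, \eta', \eta'') \\
\mathbf{G}[\![A_\eta]\!]\, d & = & \mathbf{A}[\![A_\eta]\!]\, d \\
 & & \cup\ \{ X_\eta \succeq X_{\eta'} \mid X \in \mathit{in}(A),\ \eta' \in \mathit{reach}(X, \eta) \} \\[6pt]
\mathbf{A}[\![X ==_\eta Y]\!]\, d & = & \{ \mathcal{R}_\eta \succeq^* X_\eta \sqcup Y_\eta \} \\
\mathbf{A}[\![X :=_\eta Y]\!]\, d & = & \{ X_\eta \succeq Y_\eta \sqcup \mathcal{C}_\eta,\ \mathcal{R}_\eta \succeq \bot \} \\
\mathbf{A}[\![X \Rightarrow_\eta f(\overline{Y})]\!]\, d & = & \bigcup_{Y_i \in \overline{Y}} \{ Y_{i\eta} \succeq X_\eta^{(f,i)} \sqcup \mathcal{C}_\eta \} \cup \{ \mathcal{R}_\eta \succeq^* X_\eta \} \\
\mathbf{A}[\![X \Leftarrow_\eta f(\overline{Y})]\!]\, d & = & \bigcup_{Y_i \in \overline{Y}} \{ X_\eta^{(f,i)} \succeq Y_{i\eta} \sqcup \mathcal{C}_\eta \} \cup \{ \mathcal{R}_\eta \succeq \bot \} \\
\mathbf{A}[\![p(X_1, \ldots, X_n)_\eta]\!]\, d & = & \rho(\mathit{proj}_{\mathit{Args}(p) \cup \{\mathcal{R}_{\eta_b}\}}(d\, p)) \cup \{ X_{i\eta} \succeq \mathcal{C}_\eta \mid X_i \in \mathit{out}(p) \}
\end{array}
$$

where Args(p) denotes the sequence of formal arguments in the definition of p/n, $\eta_b$ is 
associated to the body goal in the definition of p/n and $\rho$ is a renaming mapping the 
sequence of formal arguments Args(p) to the sequence of actual arguments $(X_1, \ldots, X_n)$ 
and $\mathcal{R}_{\eta_b}$ to $\mathcal{R}_\eta$. 

Fig. 5. The call independent abstract semantics 

$$
\begin{array}{lcl}
\mathit{GG}_{conj}(\eta, \eta', \eta'') & = & \left\{ \begin{array}{l} \mathcal{C}_{\eta'} \succeq \mathcal{C}_\eta \quad \mathcal{C}_{\eta''} \succeq \mathcal{C}_\eta \quad \mathcal{C}_{\eta''} \succeq \mathcal{R}_{\eta'} \\ \mathcal{R}_\eta \succeq \mathcal{R}_{\eta'} \quad \mathcal{R}_\eta \succeq \mathcal{R}_{\eta''} \end{array} \right\} \\[10pt]
\mathit{GG}_{disj}(\eta, \eta', \eta'') & = & \left\{ \begin{array}{l} \mathcal{C}_{\eta'} \succeq \mathcal{C}_\eta \quad \mathcal{C}_{\eta''} \succeq \mathcal{C}_\eta \\ \mathcal{R}_\eta \succeq \mathcal{R}_{\eta'} \quad \mathcal{R}_\eta \succeq \mathcal{R}_{\eta''} \end{array} \right\} \\[10pt]
\mathit{GG}_{not}(\eta, \eta') & = & \{ \mathcal{C}_{\eta'} \succeq \mathcal{C}_\eta \quad \mathcal{R}_\eta \succeq \mathcal{R}_{\eta'} \} \\[6pt]
\mathit{GG}_{if}(\eta, \eta', \eta'', \eta''') & = & \left\{ \begin{array}{l} \mathcal{C}_{\eta'} \succeq \mathcal{C}_\eta \quad \mathcal{C}_{\eta''} \succeq \mathcal{C}_\eta \quad \mathcal{C}_{\eta'''} \succeq \mathcal{C}_\eta \\ \mathcal{C}_{\eta''} \succeq \mathcal{R}_{\eta'} \quad \mathcal{C}_{\eta'''} \succeq \mathcal{R}_{\eta'} \\ \mathcal{R}_\eta \succeq \mathcal{R}_{\eta'} \quad \mathcal{R}_\eta \succeq \mathcal{R}_{\eta''} \quad \mathcal{R}_\eta \succeq \mathcal{R}_{\eta'''} \end{array} \right\}
\end{array}
$$

Fig. 6. The call independent abstract semantics (ctd.) 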

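Example 15. Reconsider the definition of append/3 in Figure 3. The body goal 
contains the following structured subgoals: a conjunction identified by program 
point $c_1$ with the atomic conjuncts identified by program points 1 and 2, a second 
conjunction identified by $c_2$ with the atomic conjuncts identified by program 
points 4 and 5, a third conjunction identified by $c_3$ with the conjuncts identified 
by program points 3 and $c_2$, and a disjunction identified by program point $d_1$ 
with the disjuncts identified by $c_1$ and $c_3$. The binding-time constraints that are 
associated to each of these structured goals are as follows: 

$$
\begin{array}{lll}
(c_1) & \mathcal{C}_1 \succeq \mathcal{C}_{c_1} & \mathcal{R}_{c_1} \succeq \mathcal{R}_1 \\
      & \mathcal{C}_2 \succeq \mathcal{C}_{c_1} & \mathcal{R}_{c_1} \succeq \mathcal{R}_2 \\
      & \mathcal{C}_2 \succeq \mathcal{R}_1 & \\[4pt]
(c_2) & \mathcal{C}_4 \succeq \mathcal{C}_{c_2} & \mathcal{R}_{c_2} \succeq \mathcal{R}_4 \\
      & \mathcal{C}_5 \succeq \mathcal{C}_{c_2} & \mathcal{R}_{c_2} \succeq \mathcal{R}_5 \\
      & \mathcal{C}_5 \succeq \mathcal{R}_4 & \\[4pt]
(c_3) & \mathcal{C}_3 \succeq \mathcal{C}_{c_3} & \mathcal{R}_{c_3} \succeq \mathcal{R}_3 \\
      & \mathcal{C}_{c_2} \succeq \mathcal{C}_{c_3} & \mathcal{R}_{c_3} \succeq \mathcal{R}_{c_2} \\
      & \mathcal{C}_{c_2} \succeq \mathcal{R}_3 & \\[4pt]
(d_1) & \mathcal{C}_{c_1} \succeq \mathcal{C}_{d_1} & \mathcal{R}_{d_1} \succeq \mathcal{R}_{c_1} \\
      & \mathcal{C}_{c_3} \succeq \mathcal{C}_{d_1} & \mathcal{R}_{d_1} \succeq \mathcal{R}_{c_3}
\end{array}
$$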


The binding-time constraints that are associated to an atomic goal are somewhat 
more involved. Apart from binding-time constraints on the atom's output 
variables, analysing an atom also possibly results in a binding-time constraint 
on the control variable $\mathcal{R}_\eta$, indicating under what conditions the atom can be 
reduced to true, fail, or code that is guaranteed to succeed. Moreover, when creating 
the binding-time constraints on the atom's output variables, the control 
variable $\mathcal{C}_\eta$ must be taken into account, in order to guarantee that the particular 
binding-time is made $\top$ in case the atom is under dynamic control. 

Note that in the definition of $\mathbf{A}$ the binding-time variables that refer to the 
input variables of an atom at program point $\eta$ are indexed by the program point 
$\eta$. Consequently, a number of additional constraints must be created for each 
atom, relating the binding-time of such an input argument at program point 
$\eta$ with its binding-time at the program point(s) where the binding-time was 
created, being output of some other atom. 

A test does not have any output variables, so it only creates constraints 
on control variables. The atom reduces to true, fail or code that is guaranteed 
to succeed when both input variables are bound to an outermost functor. An 
assignment X := Y introduces the constraints specifying that the binding-time 
of X at program point $\eta$ must be at least as dynamic as the binding-time of Y 
at program point $\eta$. Recall that the latter's value is constrained to be at least 
as dynamic as the least upper bound of the binding-times of Y at the reachable 
program points where Y is assigned a value. Moreover, if the assignment is under 
dynamic control, $X_\eta$ must be assigned the value $\top$. This is guaranteed by adding 
$\sqcup\ \mathcal{C}_\eta$ to the right-hand side of the constraint on $X_\eta$. Even if an assignment is 
not reduced, it can never fail at run time. Hence the (superfluous) constraint 
$\mathcal{R}_\eta \succeq \bot$. A deconstruction introduces some binding-time constraints indicating 
that the binding-time of the newly introduced variables must be at least as 
dynamic as the corresponding subvalue in the binding-time of the variable that is 
deconstructed. Also in this case, the least upper bound with $\mathcal{C}_\eta$ guarantees that, 
if the deconstruction is under dynamic control, the newly introduced binding-time 
variables will be forced to have the value $\top$. If the deconstructed variable 
is bound to at least an outermost functor, the deconstruction reduces to true 
or fail at specialisation time. Otherwise, a residualised deconstruction can either 
succeed or fail at run time, which is reflected by the fact that in that case $\mathcal{R}_\eta$ 
will have the value $\top$. When handling a construction on the other hand, the 
binding-time of the constructed variable is constrained by the binding-times 
of the variables used in the construction. Again, if the construction is under 
dynamic control, the constructed binding-time is guaranteed to be $\top$ by the use 
of the least upper bound with $\mathcal{C}_\eta$. Even when residualised, a construction can 
never fail, so again the (superfluous) constraint $\mathcal{R}_\eta \succeq \bot$ is introduced. 
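
The four kinds of unification described above can be turned into constraints by a small case analysis. The following sketch follows the prose description rather than any particular implementation: the triple encoding `(lhs, relation, rhs_terms)`, where `rhs_terms` is a tuple whose least upper bound forms the right-hand side, and the names `var`, `BOT` and `unification_constraints` are all assumptions made for illustration.

```python
BOT = "BOT"   # the constant static right-hand side


def var(name, point, path=()):
    """A binding-time variable: program variable, program point, type path."""
    return (name, point, path)


def unification_constraints(kind, point, x, ys=(), functor=None):
    """Constraints for the unification of the given kind at a program point."""
    c = var("C", point)   # control variable: is the atom under dynamic control?
    r = var("R", point)   # control variable: can the atom's reduction be decided?
    if kind == "==":      # test: only a constraint on R, looking at both inputs
        return [(r, ">=*", (var(x, point), var(ys[0], point)))]
    if kind == ":=":      # assignment: X follows Y and C; it can never fail
        return [(var(x, point), ">=", (var(ys[0], point), c)),
                (r, ">=", (BOT,))]
    if kind == "=>":      # deconstruction: each Y_i follows a subvalue of X and C
        cs = [(var(y, point), ">=", (var(x, point, ((functor, i),)), c))
              for i, y in enumerate(ys, start=1)]
        return cs + [(r, ">=*", (var(x, point),))]
    if kind == "<=":      # construction: subvalues of X follow the Y_i and C
        cs = [(var(x, point, ((functor, i),)), ">=", (var(y, point), c))
              for i, y in enumerate(ys, start=1)]
        return cs + [(r, ">=", (BOT,))]
    raise ValueError(f"unknown unification kind: {kind}")


# The construction Z <= [E|R] at program point 5 of Fig. 3.
CONSTRUCTION_5 = unification_constraints("<=", 5, "Z", ("E", "R"), "[|]")
```

The least upper bound with the control variable is what forces an output to dynamic when the atom is under dynamic control, independently of the binding-times of its inputs.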

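Example 16. Reconsider the definition of append/3 in Figure 3. The constraints 
that are associated to the unifications in append/3's body goal are as follows. 
The numbers in the left-hand column denote the particular unification's 
program point. 

$$
\begin{array}{ll}
(1) & \mathcal{R}_1 \succeq^* X_0 \\
(2) & \mathcal{R}_2 \succeq \bot \qquad Z_2 \succeq Y_0 \\
(3) & \text{[constraints not legible in the source]} \\
(5) & \mathcal{R}_5 \succeq \bot \qquad \text{[remaining constraints not legible in the source]}
\end{array}
$$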

Finally, handling a procedure call $p(X_1, \ldots, X_n)$ involves retrieving the 
constraints for the called procedure p from the denotation and projecting these 
onto the set of variables $\mathit{Args}(p) \cup \{\mathcal{R}_{\eta_b}\}$. This projection operation makes sure 
that the constraints on these variables are in normal form, i.e. that they are 
expressed in terms of in(p). The resulting set of constraints is then renamed 
to the context of the call. The formal arguments of p, Args(p), are renamed to 
their corresponding actual arguments in $(X_1, \ldots, X_n)$. The constraints on $\mathcal{R}_{\eta_b}$ 
are renamed to constraints on $\mathcal{R}_\eta$, expressing that the call reduces to true, fail or 
code that is guaranteed to succeed if the body of the called procedure reduces 
to true, fail or code that is guaranteed to succeed. 

Example 17. Let P denote the program consisting only of the definition of 
append/3 depicted in Figure 3 and let (1) and (2) denote, respectively, the 
sets of constraints depicted in Examples 15 and 16. The fixed point computation 
for $\mathbf{P}[\![P]\!]$ starts with an empty denotation and hence, in the first round of the 
computation, the recursive call does not introduce any constraints; the result 
of $\mathbf{C}[\![\mathit{append}/3]\!]\{\}$ is a denotation that maps append/3 to the constraint set 
(1) $\cup$ (2). It is only in the second round, when the constraints are projected and 
renamed, that the recursive call adds the constraints 

$$R_4 \succeq Y_0 \qquad \mathcal{R}_4 \succeq^* Es_3$$

One can verify that in a next round no new constraints are introduced by the 
recursive call, and hence $\mathbf{P}[\![P]\!]$ results in a denotation that associates append/3 
to the union of the constraints derived above with the sets (1) and (2). 

3.4 From Constraints to Annotations 

Once we have computed $\mathbf{P}[\![P]\!]$, it suffices to have a set of binding-times for the 
input variables of a procedure p in order to compute the binding-times of the 
remaining variables in the definition of p, as well as the annotations that are 
associated with a particular atom in the definition of p. Let us first introduce 
the semantic domain Call, which we use to represent a call in the domain of 
binding-times: 

$$\mathit{Call} = \{ p(\beta_1, \ldots, \beta_n) \mid p/n \in \Pi \text{ and } \forall i : \beta_i \in \mathit{BT} \}$$

To ease notation, we assume that such a call contains a binding-time for each 
argument (input as well as output). However, since these calls are used to represent 
the binding-times of the input arguments of the call only, we assume the 
binding-times of the output arguments to be $\bot$. We will denote elements of Call 
by a single Greek letter $\pi$ if the particular procedure/argument combination is 
irrelevant. We can now define the annotation of a procedure with respect to a 
particular call as follows: 

Definition 15 (procedure annotation). Given a denotation $d \in \mathit{Den}$ for 
a program P and a call $p(\beta_1, \ldots, \beta_n) \in \mathit{Call}$, the procedure annotation (of a 
procedure $p \in \mathit{Proc}(P)$) induced by a call $p(\beta_1, \ldots, \beta_n)$ is defined as the least 
solution $\sigma$ of $(d\, p)$ in which $\sigma(X_i) = \beta_i$ for every $X_i \in \mathit{in}(p)$. 

Being a solution of the set of binding-time constraints associated to a proce- 
dure p, a procedure annotation not only provides binding-times for all program 
variables in p, but also maps every binding-time variable of the form C,, to either 
T or T, denoting respectively that the goal at program point rj in the proce- 
dure’s body should be evaluated during specialisation, or be residualised. Being a 
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least solution, a procedure annotation contains the least dynamic binding-times 
while still satisfying the congruence relation. As such, a procedure annotation 
of a procedure p with respect to a call tt represents control information for a 
specialiser as to how to treat each subgoal of the body of p, when a call to p is 
approximated by tt. 
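
As an illustration of what "least solution" means here, the following is a minimal sketch under strong simplifying assumptions: binding-times are collapsed to two values (static and dynamic) instead of the structured binding-times used in the text, and the constraint set and variable names are hypothetical.

```python
# Two-point simplification: False stands for static (bottom), True for dynamic (top).
# A constraint (lhs, rhs) reads "lhs covers the least upper bound of the variables in rhs".

def least_solution(constraints, inputs):
    """Return the least assignment satisfying all constraints with the inputs fixed."""
    sigma = {v: False for lhs, rhs in constraints for v in (lhs, *rhs)}
    sigma.update(inputs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in constraints:
            # Raise lhs only when some right-hand side variable forces it.
            if not sigma[lhs] and any(sigma[v] for v in rhs):
                sigma[lhs] = True
                changed = True
    return sigma

# Hypothetical constraints for a procedure p(X, Y, Z) with inputs X and Y.
constraints = [
    ("Z", ("X", "Y")),    # data flow: Z depends on X and Y
    ("C1", ("X",)),       # control variable of program point 1 depends on X
    ("C2", ("C1", "Y")),  # program point 2 depends on point 1 and on Y
]

sigma = least_solution(constraints, {"X": False, "Y": True})
```

With X static and Y dynamic, the goal at point 1 stays reducible while Z and the goal at point 2 become dynamic; nothing is marked dynamic unless a constraint forces it.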

A polyvariant analysis for a program $P$ and an initial call $p(\beta_1, \ldots, \beta_n)$ can 
then be performed by first computing the procedure annotation $\sigma$ of $p$ induced 
by $p(\beta_1, \ldots, \beta_n)$ and subsequently computing, for every call $q(X_1, \ldots, X_m)$ that 
occurs at some program point $\eta$ in the definition of $p$, the procedure annotation 
of $q$ induced by $q(\sigma(X_{1\eta}), \ldots, \sigma(X_{m\eta}))$. This process is repeated recursively until 
no abstract call remains for which a procedure annotation has not yet been 
constructed. In other words, a polyvariant annotation process for a 
program $P$ with initial call $\pi$ boils down to computing the abstract callset of 
$(P, \pi)$: the set of abstractions of all calls that can possibly be encountered during 
evaluation of $P$ with respect to a call that is abstracted by $\pi$. Formally, we also 
define this annotation process by a number of semantic functions that define the 
meaning of a program $P$ with respect to an initial call $\pi$ as a set of calls in the 
domain of binding-times. 

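Definition 16 (annotation semantics). The first-order annotation semantics 
has semantic domain $Den_c : \wp(Call)$ and semantic functions 

$\mathcal{P}_c : Prog \mapsto Call \mapsto Den_c$ 

$\mathcal{C}_c : Proc \mapsto Call \mapsto Den_c \mapsto Den_c$ 

$\mathcal{G}_c : Goal \mapsto Call \mapsto Den_c$ 

defined in Figure 7. 

$\mathcal{P}_c[\![P]\!]\pi = \mathit{lfp}\Big(\bigcup_{p \in Proc(P)} \mathcal{C}_c[\![p]\!]\pi\Big)$ 

$\mathcal{C}_c[\![p(X_1,\ldots,X_n) \leftarrow B]\!]\pi S = \bigcup_{p(\beta_1,\ldots,\beta_n) \in S \cup \{\pi\}} \mathcal{G}_c[\![B]\!]\,p(\beta_1,\ldots,\beta_n)$ 

$\mathcal{G}_c[\![not(G)]\!]\pi = \mathcal{G}_c[\![G]\!]\pi$ 

$\mathcal{G}_c[\![G_1, G_2]\!]\pi = \mathcal{G}_c[\![G_1]\!]\pi \cup \mathcal{G}_c[\![G_2]\!]\pi$ 

$\mathcal{G}_c[\![G_1 ; G_2]\!]\pi = \mathcal{G}_c[\![G_1]\!]\pi \cup \mathcal{G}_c[\![G_2]\!]\pi$ 

$\mathcal{G}_c[\![if\ G_1\ then\ G_2\ else\ G_3]\!]\pi = \mathcal{G}_c[\![G_1]\!]\pi \cup \mathcal{G}_c[\![G_2]\!]\pi \cup \mathcal{G}_c[\![G_3]\!]\pi$ 

$\mathcal{G}_c[\![q(Y_1,\ldots,Y_n)_\eta]\!]\pi = \{\, q(\sigma_\pi(Y_{1\eta}),\ldots,\sigma_\pi(Y_{n\eta})) \,\}$ 

and $\mathcal{G}_c[\![A]\!]\pi = \emptyset$ for any other atomic goal $A$, and where $\sigma_\pi$ denotes the procedure 
annotation induced by $\pi \in Call$. 
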
Fig. 7. The annotation semantics 

The definition of the semantic functions $\mathcal{P}_c$, $\mathcal{C}_c$ and $\mathcal{G}_c$ is straightforward. 
The semantic domain $Den_c = \wp(Call)$ represents the set of all abstract callsets. 
The semantics of a program $P$ with respect to an initial call $\pi$ is defined as 
the least fixed point of repeatedly computing the semantics of each procedure 
(by $\mathcal{C}_c$) in $P$ within the context of this initial call and a (possibly incomplete) 
denotation containing the result of analysis so far. The analysis function $\mathcal{C}_c$ 
constructs a partial denotation for a particular procedure as the union of the 
denotations obtained by analysing the procedure’s body goal with respect to 
every call to the procedure encountered so far. The semantics of an individual 
goal $G$ in the body of a procedure $p$ is defined with respect to a call $\pi$ to $p$. The 
definition of $\mathcal{G}_c$ is straightforward, as it only collects the abstract calls 
encountered in the annotation of $p$ induced by $\pi$. Note that the analysis is guaranteed 
to create a finite number of procedure annotations since every procedure has a 
finite number of arguments, every such argument can only be approximated by 
a finite number of binding-times, and hence only a finite number of call patterns 
can be constructed for a particular procedure. 
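
The fixed point described above can be pictured as a worklist computation over abstract calls. The sketch below is only an illustration: the program representation, the `toy_annotate` function and the procedures `p` and `q` are hypothetical stand-ins, and binding-times are reduced to the strings `"static"` and `"dynamic"`.

```python
def callset(program, annotate, initial_call):
    """Collect every abstract call reachable from initial_call.

    program maps a procedure name to the calls in its body, each given as
    (callee, argument variables). annotate(call) returns the procedure
    annotation induced by call, as a dict from variables to binding-times.
    """
    seen = {initial_call}
    todo = [initial_call]
    while todo:
        call = todo.pop()
        name, _ = call
        sigma = annotate(call)
        for callee, args in program[name]:
            new_call = (callee, tuple(sigma[a] for a in args))
            if new_call not in seen:  # each call pattern is annotated only once
                seen.add(new_call)
                todo.append(new_call)
    return seen

def toy_annotate(call):
    # Hypothetical annotations: in p, Z depends on X and Y; q just copies its inputs.
    name, bts = call
    if name == "p":
        x, y = bts
        z = "dynamic" if "dynamic" in (x, y) else "static"
        return {"X": x, "Y": y, "Z": z}
    a, b = bts
    return {"A": a, "B": b}

# p(X, Y) calls q(X, Z); q(A, B) calls q(B, B).
program = {"p": [("q", ("X", "Z"))], "q": [("q", ("B", "B"))]}
calls = callset(program, toy_annotate, ("p", ("static", "dynamic")))
```

Termination follows from the same argument as in the text: there are only finitely many call patterns per procedure, and each is processed at most once.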

3.5 On the Modularity of the Approach 

In summary, the binding-time analysis we have developed so far is performed 
in two phases. The first phase of the process performs the data flow 
analysis in a symbolic way. A procedure is analysed independently of a particular 
call pattern, and the analysis handles procedure calls by projecting and 
renaming the constraints that are associated with the called procedure. For a program 
that is divided into several modules, this means that the constraint generating 
phase of the analysis can be performed one module at a time, bottom-up in the 
module hierarchy if we consider hierarchies without circularities. Reconsider the 
module hierarchy from Fig. 1. The result of bottom-up analysis of this hierarchy 
is depicted in Fig. 8. First, the modules at the bottom level, $M_4$ and $M_5$, are 
analysed. Since these modules do not import any other modules, they can be 
treated as regular programs, and we can simply compute $\mathcal{P}[\![M_4]\!]$ and $\mathcal{P}[\![M_5]\!]$. The 
rounded boxes in the figure denote the result of computing $\mathcal{P}[\![M]\!]$ for a particular 
module $M$. The shaded part of a box represents this denotation, restricted to 
the procedures from the module’s interface. Subsequently, the modules $M_2$ and 
$M_3$ can be analysed, since their analysis only requires the constraints from the 
interface procedures of $M_4$, and of $M_4$ and $M_5$, respectively. Computation of $\mathcal{P}[\![M_2]\!]$ and 
$\mathcal{P}[\![M_3]\!]$ can proceed as before, with the exception that the fixed point 
computation should not be started from the empty denotation, but rather from $\mathcal{P}[\![M_4]\!]$ 
and $\mathcal{P}[\![M_4]\!] \cup \mathcal{P}[\![M_5]\!]$ respectively. Finally, once the results of analysing $M_2$, $M_3$ 
and $M_5$ are available, the module $M_1$ can be analysed. Note that in this 
process, each module is analysed only once. If a module, like $M_5$ in the example, is 
imported by more than one module, analysing the importing modules only requires 
the result of analysing the imported one. 
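
The bottom-up schedule can be sketched with a topological sort of the import graph. This is an illustration only: the import relation below is reconstructed from the description in the text (it is an assumption about Fig. 1), `toy_analysis` is a hypothetical placeholder for the per-module fixed point, and for simplicity the starting denotation is the union of everything computed for the imported modules rather than only their interface procedures.

```python
from graphlib import TopologicalSorter

def analyse_bottom_up(imports, analyse_module):
    """Analyse each module once, after all the modules it imports."""
    results = {}
    # static_order() yields a module only after every module it depends on.
    for module in TopologicalSorter(imports).static_order():
        start = set()
        for imported in imports[module]:
            start |= results[imported]  # start the fixed point from the imports' results
        results[module] = analyse_module(module, start)
    return results

# Assumed import relation: M1 imports M2, M3 and M5; M2 imports M4; M3 imports M4 and M5.
imports = {
    "M1": {"M2", "M3", "M5"},
    "M2": {"M4"},
    "M3": {"M4", "M5"},
    "M4": set(),
    "M5": set(),
}

order = []

def toy_analysis(module, start):
    # Placeholder: record the visit and add one symbolic "constraint set" per module.
    order.append(module)
    return start | {f"constraints of {module}"}

results = analyse_bottom_up(imports, toy_analysis)
```

Every module appears exactly once in `order`, and a module shared by several importers (here M5) is not re-analysed for each of them.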

The second phase of the analysis, computing the procedure annotations, is 
naturally a call-dependent process. Consequently, annotating a multi-module 
program for an initial call to a procedure $p$ in the top-level module requires the 
constraints for all the procedures (spread out over all modules) that are in the call 
graph for $p$. One could argue that this corresponds to analysing a multi-module 
program as if it were a single-module monolithic program. However, it should be 
noted that computing a procedure annotation induced by a particular call is a 
rather cheap process. Since the involved constraints are in normal form, it merely 
consists of performing a substitution on the right-hand side of the constraints 
and computing their least upper bounds. The hard part of the analysis, tracing 
the data flow between the input and output arguments of a procedure, which 
possibly involves procedure calls over module boundaries, is done at the symbolic 
level, in a modular fashion. 
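
To make the "cheap" second phase concrete, here is a small sketch under two assumptions that are not spelled out in this section: that a constraint in normal form has only input variables on its right-hand side, and that binding-times can be simplified to static (`False`) versus dynamic (`True`). The constraint set is hypothetical.

```python
def annotation_from_normal_form(normal_form, inputs):
    """Evaluate normal-form constraints in a single pass, without iteration.

    normal_form maps each non-input variable to the input variables on its
    right-hand side; inputs maps input variables to True (dynamic) or False (static).
    """
    sigma = dict(inputs)
    for lhs, rhs in normal_form.items():
        # Substitute the input binding-times and take their least upper bound.
        sigma[lhs] = any(inputs[v] for v in rhs)
    return sigma

# Hypothetical normal-form constraints for a procedure with inputs X and Y.
normal_form = {"Z": ("X", "Y"), "C1": ("X",), "C2": ("X", "Y")}
sigma = annotation_from_normal_form(normal_form, {"X": False, "Y": True})
```

No fixed point iteration is needed because no left-hand side variable occurs on a right-hand side.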

4 Higher-Order Binding-Time Analysis 

Mercury is a higher-order language in which closures can be created, passed 
as arguments of predicate calls, and in turn be called themselves. To describe 
the higher-order features of the language, it suffices to extend the definition of 
superhomogeneous form (see Definition 8) with two new kinds of atoms: 

- A higher-order unification, which is of the form $X \Leftarrow p(V_1, \ldots, V_k)$ where 
$X, V_1, \ldots, V_k \in \mathcal{V}$ and $p/n \in \Pi$ with $k \leq n$. 

- A higher-order call, which is of the form $X(V_{k+1}, \ldots, V_n)$ where $X$ and 
$V_{k+1}, \ldots, V_n \in \mathcal{V}$ with $0 \leq k \leq n$. 

A higher-order unification $X \Leftarrow p(V_1, \ldots, V_k)$ constructs a closure from an 
$n$-ary procedure $p$ by currying the first $k$ arguments (with $k \leq n$). The result 
of the construction is assigned to the variable $X$ and denotes a procedure of 
arity $n - k$. Such a closure can be called by a higher-order call of the form 
$X(V_{k+1}, \ldots, V_n)$ where $V_{k+1}, \ldots, V_n$ are the $n - k$ remaining arguments. The 
effect of evaluating the conjunction $X \Leftarrow p(V_1, \ldots, V_k), X(V_{k+1}, \ldots, V_n)$ equals 
the effect of evaluating $p(V_1, \ldots, V_n)$.$^{9}$ 
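
As a loose analogy only (Mercury procedures are relations with modes, not functions), the construct-then-call behaviour resembles partial application. The sketch uses Python's `functools.partial`; the procedure `p` is hypothetical.

```python
from functools import partial

def p(v1, v2, v3):
    """A hypothetical 3-ary procedure; here it simply returns its arguments."""
    return (v1, v2, v3)

# Analogue of X <= p(V1): curry the first argument, leaving a closure of arity 2.
x = partial(p, "a")

# Analogue of X(V2, V3): supply the remaining arguments.
via_closure = x("b", "c")
direct = p("a", "b", "c")
```

Calling the closure with the remaining arguments gives the same result as calling `p` directly with all of them.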

In order to represent higher-order types it suffices to add a special type 
constructor, pred, to $\Sigma_T$. This constructor is special in the sense that it can be 
used with any arity and it has no type rule associated with it. Consequently, a 
higher-order type corresponds with a leaf node in a type tree. In what follows we 
represent higher-order types as $pred(t_1, \ldots, t_k)$ with $t_1, \ldots, t_k$ first-order types. 
We furthermore assume that higher-order types are not used in the definition of 
other types; that is, values of higher-order type are only constructed, called, or 
passed around as arguments of a procedure call.$^{10}$ 

The basic problem when analysing a procedure involving higher-order calls 
is that the control flow in the procedure is determined by the values of the 
higher-order variables. To retrieve a set of suitable binding-time constraints 
between the input and output arguments of a higher-order call $X(Y_{k+1}, \ldots, Y_n)$, it is 
necessary to know to some extent which closures $X$ can be bound to during 
specialisation. Consequently, to achieve an acceptable level of precision, the 
symbolic data flow analysis needs to be enhanced by some form of closure analysis 
[23,35], which basically computes for every higher-order call an approximation of 
the closures that may be bound to the higher-order variable involved. In what 
follows, we first define a suitable representation for such closure information; 
next we reformulate the first phase of our binding-time analysis in such a way 
that it integrates the derivation of closure information with the derivation of 
binding-time constraint systems. Doing so basically transforms the process of 
building constraint systems into a call-dependent process, since closures can be 
passed around by procedure calls and hence the analysis needs to take the 
closure information from a particular call pattern into account. We conclude this 
section with a discussion on the modularity of the higher-order approach. 



4.1 Representing Closures 

In order to use closures during binding-time analysis, where concrete values of 
the closure’s curried arguments are approximated by binding-times, we introduce 
the notion of a binding-time closure as follows. 

Definition 17 (binding-time closure). A binding-time closure is a term of 
the form $p(\beta_1, \ldots, \beta_k)$ where $p/n \in \Pi$, $k \leq n$ and $\beta_1, \ldots, \beta_k$ are binding-times. The set of 
all such binding-time closures is denoted by $Clos$. 

$^{9}$ When writing Mercury code, the programmer can also use lambda expressions to 
construct closures. These can, however, be converted into a regular procedure 
definition which is then again used to construct the closure as above. The Melbourne 
Mercury compiler does this conversion as part of the translation into 
superhomogeneous form. Note that closures cannot be constructed from other closures: once a 
closure is created, one can only call it or pass it as an argument to another procedure. 
$^{10}$ In fact, this is also a limitation of release 0.9 of the Mercury implementation [40]. 




218 Wim Vanhoof, Maurice Bruynooghe, and Michael Leuschel 



If $p/n \in \Pi$, $p(\beta_1, \ldots, \beta_k)$ approximates a set of procedures of arity $n - k$, each 
being an instance of $p$ in which the first $k$ arguments are fixed and whose values 
are approximated by the binding-times $\beta_1, \ldots, \beta_k$. 

Example 18. Given the traditional append/3 procedure and $\beta_l$ being a 
binding-time approximating terms of type list(T) that are instantiated at least up to a 
list skeleton, $append$, $append(\beta_l)$ and $append(\beta_l, \beta_l)$ are examples of binding-time 
closures of arity 3, 2 and 1 respectively. 

In order to obtain a precise binding-time analysis, we approximate the value 
of a higher-order variable with a set of binding-time closures. A singleton set 
$\{c\}$ describes that the higher-order variable under consideration is, during 
specialisation, definitely bound to a closure that is approximated by $c$. In general, 
a set $\{c_1, \ldots, c_n\}$ describes that the higher-order variable under consideration 
is bound during specialisation to a closure that is approximated either by $c_1$, 
$c_2$, ..., or $c_n$. To make this representation explicit, we alter the definition of the 
domain $\mathcal{B}$. Instead of containing only the values static and dynamic, we now 
include a value $static(S)$ with $S$ being a set of binding-time closures. Note that, 
if we define $dynamic > static$ as before and $static(S_1) > static(S_2)$ if and only 
if $S_1 \supset S_2$, $\mathcal{B}$ is still partially ordered. Since the binding-times now include 
higher-order binding-times, we alter the definition of the partial order relation 
on $BT$: 

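Definition 18 (covers). Let $\beta, \beta' \in BT$ such that $dom(\beta) \subseteq dom(\beta')$ or 
$dom(\beta') \subseteq dom(\beta)$. We say that $\beta$ covers $\beta'$, denoted by $\beta \succeq \beta'$, if and only 
if it holds for all $\delta \in dom(\beta) \cap dom(\beta')$ that: 

- $\beta'(\delta) = dynamic$ implies $\beta(\delta) = dynamic$, and 

- $\beta'(\delta) = static(S')$ implies $\beta(\delta) = static(S)$ and $S \supseteq S'$. 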

Note that, with this new definition, the covers relation remains only defined 
between two binding-times that are derived from types that are instances of 
each other. In case of higher-order binding-times this means that both sets of 
binding-time closures contain closures of identical arity and argument types. 
Like before, we denote with $BT^{+}$ the set $BT \cup \{\top, \bot\}$, and $(BT^{+}, \succeq)$ forms a 
complete lattice. 

4.2 Higher-Order Binding-Time Analysis 

We now reformulate the analysis from Section 3 such that it takes the 
higher-order constructs of Mercury into account. As a first observation, note that 
the binding-time constraints that are associated with first-order unifications and 
structured goals (see Figures 5 and 6) remain unchanged in the context of a 
higher-order analysis. To deal with higher-order constructions, we add an 
extra form of binding-time constraint to $BTC$, namely a constraint of the form 
$X_\eta \succeq p(X_{1\eta}, \ldots, X_{k\eta})$. The intended meaning is that the (higher-order) 
binding-time associated with $X$ at program point $\eta$ should at least contain a closure 
constructed from $p$ and the binding-times of its arguments at program point $\eta$. 
Formally, we extend the definition of a solution (Definition 10) such that for 
every constraint of the form $X_\eta \succeq p(X_{1\eta}, \ldots, X_{k\eta})$ it holds that 

$\sigma(X_\eta) \succeq static(\{p(\beta_1, \ldots, \beta_k)\})$ where $\beta_i \succeq \sigma(X_{i\eta})$ for $1 \leq i \leq k$. 

The main difference with the symbolic data flow analysis of Section 3 is that, in a 
higher-order setting, a set of constraints can no longer be associated with 
a procedure symbol (as in the semantic domain $Den$). A typical higher-order 
predicate is passed a procedure as one of its input arguments (e.g., a call 
to map has as one of its inputs the predicate to be applied to the elements of the 
list it has to process), and the resulting set of binding-time constraints depends 
on the input predicate. Instead, in the higher-order analysis, we associate a set of 
binding-time constraints with a particular abstract call. Therefore, we define the 
analysis as an abstract semantics as before, but over the new semantic domain 

$Den_{cc} : Call \mapsto \wp(BTC)$. 

The notion of a procedure annotation of a procedure $p$ induced by a call 
$p(\beta_1, \ldots, \beta_n)$ is straightforwardly adapted for use with a denotation in $Den_{cc}$ 
rather than in $Den$. Moreover, given two such mappings $f, g \in Den_{cc}$, we define 
$f \cup g$ as a mapping in $Den_{cc}$ with $dom(f \cup g) = dom(f) \cup dom(g)$ and 

$(f \cup g)(x) = \begin{cases} f(x) \cup g(x) & \text{if } x \in dom(f) \cap dom(g) \\ f(x) & \text{if } x \in dom(f) \text{ and } x \notin dom(g) \\ g(x) & \text{if } x \in dom(g) \text{ and } x \notin dom(f) \end{cases}$ 

The resulting analysis is a call-dependent analysis that is basically a combination 
of the call-independent and call-dependent analyses of Section 3. 
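
The pointwise join of two denotations defined above can be sketched as follows, representing a denotation as a dictionary from abstract calls to sets of constraints. The call and constraint names are placeholder strings chosen for illustration.

```python
def join(f, g):
    """Pointwise union of two denotations (dicts from abstract calls to constraint sets)."""
    empty = frozenset()
    # A call missing from one mapping contributes nothing, which covers all three cases.
    return {x: f.get(x, empty) | g.get(x, empty) for x in f.keys() | g.keys()}

f = {"p(static)": frozenset({"c1"})}
g = {"p(static)": frozenset({"c2"}), "q(dynamic)": frozenset()}
h = join(f, g)

# Joining with a mapping that only registers a call leaves existing constraints untouched.
same = join(f, {"p(static)": frozenset()})
```

The last line shows the property the analysis relies on when it registers a call with an empty constraint set: a call that is already in the domain keeps its constraints.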

Definition 19 (higher-order semantics). 

The higher-order semantics has semantic domain 

$Den_{cc} : Call \mapsto \wp(BTC)$ 

and semantic functions 

$\mathcal{P}_{cc} : Prog \mapsto Call \mapsto Den_{cc}$ 

$\mathcal{C}_{cc} : Proc \mapsto Call \mapsto Den_{cc} \mapsto Den_{cc}$ 

$\mathcal{G}_{cc} : Goal \mapsto Call \mapsto Den_{cc} \mapsto Den_{cc}$ 

$\mathcal{A}_{cc} : Atom \mapsto Call \mapsto Den_{cc} \mapsto Den_{cc}$ 

defined in Figure 9. 

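Again, the meaning of a program is defined as a fixed point computation over 
the meaning of the individual procedures in the program given a binding-time 
abstraction of the call with respect to which the program must be specialised. 
Each procedure is analysed (by $\mathcal{C}_{cc}$) within the context of this initial call and a 

$\mathcal{P}_{cc}[\![P]\!]\pi = \mathit{lfp}\Big(\bigcup_{p \in Proc(P)} \mathcal{C}_{cc}[\![p]\!]\pi\Big)$ 

$\mathcal{C}_{cc}[\![p(X_1,\ldots,X_n) \leftarrow B]\!]\pi d = \bigcup_{p(\beta_1,\ldots,\beta_n) \in dom(d) \cup \{\pi\}} \mathcal{G}_{cc}[\![B]\!]\,p(\beta_1,\ldots,\beta_n)\,d$ 

$\mathcal{G}_{cc}[\![(G'_{\eta'}, G''_{\eta''})_\eta]\!]\pi d = \mathcal{G}_{cc}[\![G'_{\eta'}]\!]\pi d \cup \mathcal{G}_{cc}[\![G''_{\eta''}]\!]\pi d \cup \{(\pi, GG_{conj}(\eta, \eta', \eta''))\}$ 

$\mathcal{G}_{cc}[\![not_\eta(G'_{\eta'})]\!]\pi d = \mathcal{G}_{cc}[\![G'_{\eta'}]\!]\pi d \cup \{(\pi, GG_{not}(\eta, \eta'))\}$ 

$\mathcal{G}_{cc}[\![if_\eta\ G'_{\eta'}\ then\ G''_{\eta''}\ else\ G'''_{\eta'''}]\!]\pi d = \mathcal{G}_{cc}[\![G'_{\eta'}]\!]\pi d \cup \mathcal{G}_{cc}[\![G''_{\eta''}]\!]\pi d \cup \mathcal{G}_{cc}[\![G'''_{\eta'''}]\!]\pi d \cup \{(\pi, GG_{if}(\eta, \eta', \eta'', \eta'''))\}$ 

$\mathcal{G}_{cc}[\![(G'_{\eta'} ; G''_{\eta''})_\eta]\!]\pi d = \mathcal{G}_{cc}[\![G'_{\eta'}]\!]\pi d \cup \mathcal{G}_{cc}[\![G''_{\eta''}]\!]\pi d \cup \{(\pi, GG_{disj}(\eta, \eta', \eta''))\}$ 

$\mathcal{G}_{cc}[\![A_\eta]\!]\pi d = \mathcal{A}_{cc}[\![A_\eta]\!]\pi d \cup \{(\pi, S)\}$ 
where $S = \{ X_\eta \succeq X_{\eta'} \mid X \in in(A),\ \eta' \in reach(X, \eta) \}$ 

$\mathcal{A}_{cc}[\![U]\!]\pi d = \{(\pi, \mathcal{A}[\![U]\!]d)\}$ for a first-order unification $U$ 

$\mathcal{A}_{cc}[\![X \Leftarrow p(X_1, \ldots, X_k)_\eta]\!]\pi d = \{(\pi, \{X_\eta \succeq p(X_{1\eta}, \ldots, X_{k\eta}),\ \mathcal{R}_\eta \succeq \bot\})\}$ 

$\mathcal{A}_{cc}[\![q(Y_1, \ldots, Y_n)_\eta]\!]\pi d = S_1 \cup S_2$ where 
$S_1 = \{(q(\beta_1, \ldots, \beta_n), \{\})\}$ 
$S_2 = \{(\pi,\ \rho(proj_{Args(q) \cup \{\mathcal{R}_q\}}(d(q(\beta_1, \ldots, \beta_n)))) \cup \{Y_{i\eta} \succeq^{*} \mathcal{C}_\eta \mid Y_i \in out(q)\})\}$ 
with $\beta_i = \sigma_\pi(Y_{i\eta})$ 

$\mathcal{A}_{cc}[\![X(Y_{k+1}, \ldots, Y_n)_\eta]\!]\pi d = S_1 \cup S_2$ where 
$S_1 =$ if $\sigma_\pi(X_\eta) = \top$ 
then $\{(q(\top, \ldots, \top), \{\}) \mid q/m \in Proc(P) \text{ and } m \geq n - k\}$ 
else $\{(q(\beta_1, \ldots, \beta_n), \{\}) \mid q(\beta_1, \ldots, \beta_k) \in S\}$ 
where $\sigma_\pi(X_\eta) = static(S)$ and $\beta_i = \sigma_\pi(Y_{i\eta})$ for $k + 1 \leq i \leq n$ 
$S_2 = \{(\pi,\ \bigcup_{q(\beta_1, \ldots, \beta_n) \in dom(S_1)} \rho(proj_V(d(q(\beta_1, \ldots, \beta_n)))) \cup \{Y_{i\eta} \succeq^{*} \mathcal{C}_\eta \mid Y_i \in out(q)\})\}$ 
where $V = Args(q) \cup \{\mathcal{R}_q\}$ 

Fig. 9. The higher-order semantics 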



denotation (in $Den_{cc}$) representing the (possibly incomplete) results of analysis 
so far. The definition of $\mathcal{G}_{cc}$, defining the abstract meaning of a goal, is basically 
identical to the definition of $\mathcal{G}$ from Section 3, apart from the facts that (1) it 
threads a denotation as well as the abstract call to the procedure that is 
currently being analysed and (2) it associates this abstract call with the constraints for 
a particular goal. The same observations hold for the definition of $\mathcal{A}_{cc}$. The 
constraints derived for a first-order unification are identical to those derived by $\mathcal{A}$. 
A higher-order construction results in a constraint stating that the binding-time 
of the higher-order variable must contain at least the abstract closure created at 
this program point. Note that we propagate the binding even when the 
construction is under dynamic control, as this binding substantially simplifies the 
analysis of higher-order calls. Since the atom is a construction, its reduction can never result 
in code that might fail during execution, hence the (superfluous) constraint on 
$\mathcal{R}_\eta$. 

Handling procedure calls is somewhat more involved than in the first-order 
case. Retrieving the constraints associated with a first-order call from the 
denotation now requires computing the binding-times of the arguments of the call. 
As before, $\sigma_\pi$ represents the procedure annotation induced by the call $\pi$. The 
binding-time variables in the resulting (projected) constraints are again renamed 
to the actual arguments of the call $X_1, \ldots, X_n$ and the control variable is 
renamed to $\mathcal{R}_\eta$, as before. As for the other goals, the resulting constraints are 
associated with the abstract call $\pi$ for which the surrounding procedure is being 
analysed. The resulting mapping, denoted by $S_2$ in Figure 9, is updated with the 
mapping $\{(q(\beta_1, \ldots, \beta_n), \{\})\}$ in order to make sure that the call $q(\beta_1, \ldots, \beta_n)$ is 
in the domain of the newly constructed denotation, and hence will be analysed 
during a next round of the analysis. Note that the use of $\cup$ guarantees that if the 
call was already in the domain of the denotation, the set of constraints associated 
with it remains unchanged. A higher-order call is basically handled as a set of 
first-order calls. First, the binding-time of the higher-order variable is retrieved from 
the procedure annotation $\sigma_\pi$ for the currently analysed procedure/call 
combination. If this binding-time equals $static(S)$, each closure $q(\beta_1, \ldots, \beta_k) \in S$ is 
transformed to a first-order call by adding $\sigma_\pi(X_{k+1}), \ldots, \sigma_\pi(X_n)$ to its 
arguments. From then on, the call is handled as a first-order call. The constraints 
associated with this call are retrieved from the denotation and added to the 
denotation under construction, and the call itself is added to the domain of the 
denotation under construction. 
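
The expansion of a higher-order call into a set of first-order calls can be sketched as below. This is a simplified reading of the figure, not the authors' implementation: closures are modelled as `(name, curried binding-times)` pairs, `TOP` stands for an unknown higher-order value, the procedure table is hypothetical, and the arity test in the unknown case (`arity >= number of supplied arguments`) is an assumption, since the comparison is not fully legible in the source.

```python
TOP = "top"

def first_order_calls(bt_of_x, arg_bts, procedures):
    """Turn a higher-order call X(args) into the first-order calls it may stand for.

    bt_of_x is TOP or a set of closures (name, curried binding-times);
    arg_bts are the binding-times of the arguments supplied at the call;
    procedures maps procedure names to their arities.
    """
    if bt_of_x == TOP:
        # Nothing is known about X: consider every procedure that could accept the
        # arguments, called with completely unknown binding-times.
        return {(name, (TOP,) * arity)
                for name, arity in procedures.items()
                if arity >= len(arg_bts)}
    # X is static(S): complete each closure in S with the call's argument binding-times.
    return {(name, curried + tuple(arg_bts)) for name, curried in bt_of_x}

procedures = {"append": 3, "succ": 2}
closures = {("append", ("static",))}

known = first_order_calls(closures, ("dynamic", TOP), procedures)
unknown = first_order_calls(TOP, ("dynamic", TOP), procedures)
```

Each resulting call is then treated exactly like a first-order call: its constraints are looked up in the denotation and the call itself is registered for analysis.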



4.3 On the Modularity of the Approach 

In a higher-order setting, the constraint generation phase of our binding-time 
analysis is a call-dependent process. Indeed, the data flow dependencies in a 
procedure are determined by the closures contained in the procedure’s call pattern. 
This suggests that the advantage of modularity, associated with the 
constraint-based technique in a first-order setting, might no longer hold in a higher-order 
setting. However, to some extent the analysis can still be performed in a 
bottom-up, modular way. For a module $M$ that exports the predicates $p_1, \ldots, p_n$ we 
initiate the analysis with: 

$\bigcup_{p \in \{p_1, \ldots, p_n\}} \mathcal{P}_{cc}[\![M]\!]\,p(\top, \ldots, \top).$ 

At first sight, it might seem strange to perform a call-dependent analysis with 
respect to an initial call in which all arguments are approximated by $\top$. However, 
recall that only the higher-order parts of the call patterns influence the resulting 
constraint systems. Hence, for those procedures that have no higher-order 
arguments, the constraint system derived by the call-dependent analysis for a call 
$p(\top, \ldots, \top)$ equals the one derived by the call-independent analysis of Section 3, 
and it can readily be used by other modules importing these procedures. Note 
that the call-dependent nature of the process ensures that closure information 
that is constructed in a module $M$ is propagated inside $M$ itself. It is only if 
closure information is “lost” over a module boundary that the resulting analysis 
is less precise than a full call-dependent analysis over the complete multi-module 
program. This is the case when, in some module, closure information is available 
in some arguments of a call to an imported procedure $p$ whereas, being imported, 
the constraints that are used for $p$ are those obtained by analysing $p(\top, \ldots, \top)$. 



5 Example 

In this section, we present an example, and use it to discuss to what extent the 
proposed analysis is also applicable in the context of Prolog. 



5.1 A Simple Interpreter 

Consider the simple interpreter for arithmetic expressions depicted in Figure 10, 
adapted from a Prolog version discussed and specialized in a companion chapter 
[29]. The program consists of a number of type definitions and two predicates. 



type env ---> nil ; cons(elem, env). 
type elem ---> pair(ident, int). 
type exp ---> cst(int) ; var(ident) ; +(exp, exp). 

pred lookup(ident, env, int). 
mode lookup(in, in, out) is multi. 

lookup(V, E, Val) :- E =>_1 cons(A, As), A =>_2 pair(I, VI), ( 
        V ==_3 I, Val :=_4 VI 
    ; 
        lookup(V, As, T)_5, Val :=_6 T ). 

pred int(exp, env, int). 
mode int(in, in, out) is multi. 

int(E, Env, R) :- ( 
        E =>_1 cst(C), R :=_2 C 
    ; 
        E =>_3 var(V), lookup(V, Env, Val)_4, R :=_5 Val 
    ; 
        E =>_6 +(A, B), int(A, Env, R1)_7, int(B, Env, R2)_8, plus(R1, R2, R)_9 ). 


Fig. 10. A simple interpreter 

The type env defines an environment as a list of elements, each element being 
a pair (type elem) consisting of an identifier (type ident) and an integer (type 
int). We assume that the types ident and int are atomic and builtin. The type 
exp defines an expression as either a constant integer, a variable denoted by an 
identifier, or the sum of two expressions. 

The predicate lookup/3 takes an identifier and an environment as input, 
searches for the value associated with the identifier in the environment and returns this 
value or fails. Note that the predicate is defined as being non-deterministic in 
order to mimic a purely declarative implementation in Prolog. The interpreter 
itself is represented by the predicate int which takes an expression and an 
environment as input and returns the value of the expression or fails. Both 
predicates are given in superhomogeneous form. 

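After call-independent analysis, the binding-time constraints associated with 
the lookup/3 predicate are as follows. 

$A \succeq E^{\langle(cons,1)\rangle}$ 
$As \succeq E$ 
$I \succeq E^{\langle(cons,1),(pair,1)\rangle}$ 
$I \succeq^{*} E$ 
$VI \succeq E^{\langle(cons,1),(pair,2)\rangle}$ 
$VI \succeq^{*} E$ 
$Val_4 \succeq E^{\langle(cons,1),(pair,2)\rangle}$ 
$Val_4 \succeq^{*} E \sqcup E^{\langle(cons,1)\rangle} \sqcup E^{\langle(cons,1),(pair,1)\rangle} \sqcup V$ 
$T \succeq E^{\langle(cons,1),(pair,2)\rangle}$ 
$T \succeq^{*} E \sqcup E^{\langle(cons,1)\rangle} \sqcup E^{\langle(cons,1),(pair,1)\rangle} \sqcup V$ 
$Val_6 \succeq E^{\langle(cons,1),(pair,2)\rangle}$ 
$Val_6 \succeq^{*} E \sqcup E^{\langle(cons,1)\rangle} \sqcup E^{\langle(cons,1),(pair,1)\rangle} \sqcup V$ 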

All constraints are in normalised form. Where relevant, a binding-time variable 
is indexed by a subscript indicating the program point at which the constraint 
holds. Recall that the $\succeq$-constraints express the regular data flow, whereas the 
$\succeq^{*}$-constraints reflect the specialisation strategy: a constraint $X \succeq^{*} Y^{\delta}$ denotes 
that the binding-time of $X$ cannot be static if the node $\delta$ in the binding-time 
of $Y$ is marked dynamic. Such a constraint is due to the presence, earlier in the 
predicate, of a deconstruction (or test) on $Y^{\delta}$ that may be residualised and 
subsequently fail at run-time. The interpretation of these constraints is as follows. The 
data-flow (or $\succeq$) constraints are obtained in a straightforward way, by projecting 
the constraints obtained from the unifications. The strategy (or $\succeq^{*}$) constraints 
are somewhat more involved. The constraints $I \succeq^{*} E$ and $VI \succeq^{*} E$ denote that 
$I$ and $VI$ must be $\top$ in case $E$ is not bound to an outermost functor. Indeed, if 
$E$ is not bound to an outermost functor, the deconstruction at program point 1 
cannot be reduced at specialisation-time and the atom at program point 2 (in 
which $I$ and $VI$ are assigned their value) is under dynamic control and hence 
cannot be reduced at specialisation time. Subsequently, the construction at 
program point 4 is under dynamic control if one of the preceding atoms cannot be 
reduced or results in code that may fail at runtime, which is the case if either 
the environment $E$, the elements of the environment ($E^{\langle(cons,1)\rangle}$), the identifiers 
within each such element ($E^{\langle(cons,1),(pair,1)\rangle}$), or the variable $V$ is not bound to an 
outermost functor. Similar considerations explain the constraints on $T$ and 
$Val$ at program point 6 in the other branch of the disjunction. The constraints on 
$T$ are equal to the least upper bound of those (in the least fixed point) on $Val_4$ 
and $Val_6$. Recall that the constraints on $T$, which originate from the recursive 
call, are obtained from $T \succeq Val_4 \sqcup Val_6$. 

The binding-time constraints derived for the int/3 predicate are as follows. 

(J j^{(cst,l)) 

i?2 E 

Y y j^{{var,l)) 

Val > 

Val >* EnV U U £;rj^,((co"s.l),(pair.l)) y ^(Rar.l)) 

i?5 £:j^x;((co"s,1).(p“*»’.2)> 

R^h* EU EnV U y £^^^((cons,l).(pa»r,l)> y ^{(var,l)) 

A t E((+’^^^ 

B t 

R1 >- £;d+d).(csi.l)> y Y]j^y{(cons,l),(pair,2)) 

Rl)^* EU iffi+'b) U ^((+.2)> y J^{{+,l),{var,l)) y ^((-|-,2),(i;ar,l)) y 
EnV U y ^^y{{cons,l),ipair,l)) 

R2 >~ £;d+>2).(csi.l)> y Y]j^y{(cons,l),(pair,2)) 

R2^* EU ijfi+'l)) u u y ^((-|-, 2 ),(i>ar,l))y 

EnV U £:nz;<b0"s.l)) y ^^y{{cons,l),ipair,l)) 

Rq >- _Ed+.l),(cst,l)> y ^((-|-,2),(cst,l)) y j^^^{(cons,l),(pair,2)) 

Rg)^* EU U _E<(+’2)> U £:((+. l),(«ar,l)) y ^((-|-,2),(i;ar,l)) y 

EnV U £:nU<bo"s.l)) y E^y{(cons,l),{pair,l)) 

These constraints are obtained in a similar way as those for the lookup predicate. 
Assume we want to specialise this program for the query 

int(+(cst(2),+(var(x),cst(3))), [pair(y,Yval), pair(x,Xval)], Res) (1) 

i.e., the expression to compute is fully instantiated and the domain of the 
environment mapping is fully defined but the concrete values associated with the 
identifiers are as yet unknown. These degrees of instantiation are expressed by 
the binding-times $\beta_{exp}$ defined for the type exp and $\beta_{env}$ defined for the type 
env. 

$\beta_{exp} = \{ (\langle\rangle, static),\ (\langle(cst,1)\rangle, static),\ (\langle(var,1)\rangle, static),\ (\langle(+,1)\rangle, static),\ (\langle(+,2)\rangle, static) \}$ 

$\beta_{env} = \{ (\langle\rangle, static),\ (\langle(cons,1),(pair,1)\rangle, static),\ (\langle(cons,1),(pair,2)\rangle, dynamic) \}$ 

Note that the abstract call $int(\beta_{exp}, \beta_{env}, \_)$ will give rise to an abstract call 
$lookup(static, \beta_{env}, \_)$. In the least solution of the constraints for lookup with 
respect to this call, we obtain that the output argument $Val = Val_4 \sqcup Val_6 = 
dynamic$. However, the input to each test or deconstruction in lookup is at 
least bound to an outermost functor and hence is a candidate for reduction. In 
addition, if we look at the strategy constraints 
we derive that none of the atoms is under dynamic control and consequently, 
each atom can be annotated as reducible. 

For the int predicate we consequently obtain $R = dynamic$ but, similarly 
to the case of the lookup predicate, none of the atoms is under dynamic control 
and the input to each unification is bound to at least an outermost constructor. 
Hence all unifications can be reduced. Only the predicate plus, which we assume 
to be builtin, has both input arguments dynamic and needs to be residualised. The 
result of specialisation using the obtained annotations is the residual program 
int(Xval,Yval,Res) :- plus(Xval,3,T), plus(2,T,Res). 
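
The relation between the interpreter and its residual program can be checked on a functional analogue. The sketch below is an illustration in Python, not Mercury: it assumes tuples for expressions and a list of pairs for the environment, returns only the first binding found by `lookup` (the Mercury predicate is declared non-deterministic), and `residual` is a hand transcription of the residual clause above.

```python
def lookup(ident, env):
    """Return the value bound to ident in env, a list of (identifier, value) pairs."""
    for name, value in env:
        if name == ident:
            return value
    raise KeyError(ident)

def interp(exp, env):
    """Evaluate an expression built from ("cst", n), ("var", x) and ("+", a, b)."""
    tag = exp[0]
    if tag == "cst":
        return exp[1]
    if tag == "var":
        return lookup(exp[1], env)
    if tag == "+":
        return interp(exp[1], env) + interp(exp[2], env)
    raise ValueError(tag)

def residual(xval, yval):
    """Transcription of: int(Xval,Yval,Res) :- plus(Xval,3,T), plus(2,T,Res)."""
    t = xval + 3
    return 2 + t

# The expression from query (1): 2 + (x + 3).
expression = ("+", ("cst", 2), ("+", ("var", "x"), ("cst", 3)))
value = interp(expression, [("y", 10), ("x", 4)])
```

All deconstructions of the expression and of the environment skeleton have disappeared from `residual`; only the additions on the unknown value remain, which mirrors the annotations derived above.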



5.2 The Prolog Case 

The basic characteristic of Mercury that makes this work feasible is the presence 
of type and mode information. Hence, one may ask to what extent the technique 
can be carried over to the analysis of (pure) Prolog programs. Let us assume 
that the same type information as above is available. Given that the normal use 
of the int/3 predicate is with mode (i,i,o), a mode analysis is able to show 
that lookup/3 is also called with mode (i,i,o) and that both predicates return 
a ground answer. Taking care that variables in output positions of predicates are 
first occurrences (hence free variables) one can obtain a normalisation that is 
almost a replica of the Mercury code. 

lookup(V,E,Val) :- E=cons(A,As), A=pair(I,VI), V=I, Val=VI. 
lookup(V,E,Val) :- E=cons(A,As), A=pair(I,VI), lookup(V,As,T), 
        Val=T. 

int(E,Env,R) :- E = cst(C), R=C. 
int(E,Env,R) :- E = var(V), lookup(V,Env,Val), R=Val. 
int(E,Env,R) :- E = +(A,B), int(A,Env,R1), int(B,Env,R2), 
        is+(R1,R2,R). 

Using the mode information about the variables participating in unifications, 
one could classify them into tests, assignments, constructions and 
deconstructions as in the Mercury code. There is one difference. In the case of Mercury, 
assignments and constructions are guaranteed to succeed. In the case of our 
mode analysis, a variable not having mode input can still be partially 
instantiated, hence the unification could fail at run-time. This will not happen in the 
example at hand. Indeed a simple local analysis shows that the variables being 
assigned are effectively free. E.g. in Val=VI, Val is the first occurrence of the 
output variable. Whether a unification $\eta$ can fail has to be properly encoded in 
the special binding-time analysis variable $\mathcal{R}_\eta$. Apart from this, given the type 
information and the specification of the query to be specialised, the binding-time 
analysis as done for Mercury can be performed, leading to the same annotations, 
and hence a specialiser such as LOGEN [30] could derive the same specialised code. 

Finally, it is feasible to handle more complex modes than simply input and 
output. In [7], a more refined mode analysis, called rigidity analysis, is developed. 
Given a term $t$ of type $\tau$, it considers all subtypes $\tau'$ of $\tau$. The term is $\tau'$-rigid 
if it cannot have a well-typed instance that has a variable as a subterm of type 
$\tau'$. Such a type-based rigidity analysis can provide more detailed mode 
information that has the potential to contribute to a better binding-time analysis. For 
example, such an analysis could show that a term of type elem (cf. the simple 
interpreter) that is not ground, is ident-rigid. 

To conclude the discussion of this example, we note that, within the context
of Prolog, the results obtained by the binding-time analysis could be fed directly
to the LOGEN offline partial deduction system [25,30]. This system uses the notion
of a binding-type to characterise specialisation-time values. Basic binding-types
are static (characterising a value as ground) and dynamic (characterising a
value as possibly non-ground), but more involved binding-types can be declared
by the user using binding-type rules, much in the same way as types are declared 
by type rules. 

In the interpreter example, the binding-times obtained for the types exp and env
could be translated to the following binding-type definitions:

type exp ---> cst(static) ; var(static) ; +(static, static).
type elem ---> pair(static, dynamic).
type env ---> nil ; cons(elem, env).
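
As a rough illustration of how such binding-type rules classify values, the following Python sketch checks a term against the three rules above. The term encoding (tuples for compound terms, strings and numbers for constants, a `Var` class for unbound variables) and the function names are assumptions made for this illustration only; they are not part of LOGEN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """An unbound logic variable (hypothetical encoding)."""
    name: str


# Binding-type rules from the text: type name -> [(functor, argument binding-types)].
RULES = {
    "exp": [("cst", ["static"]), ("var", ["static"]), ("+", ["static", "static"])],
    "elem": [("pair", ["static", "dynamic"])],
    "env": [("nil", []), ("cons", ["elem", "env"])],
}


def is_ground(term):
    """A term is ground if it contains no unbound variable."""
    if isinstance(term, Var):
        return False
    if isinstance(term, tuple):
        return all(is_ground(arg) for arg in term[1:])
    return True  # constants


def has_binding_type(term, btype):
    """Check whether `term` is described by the binding-type `btype`."""
    if btype == "dynamic":
        return True  # possibly non-ground: any term qualifies
    if btype == "static":
        return is_ground(term)
    if isinstance(term, Var):
        return False  # a structured binding-type fixes the outermost functor
    functor, args = (term[0], term[1:]) if isinstance(term, tuple) else (term, ())
    for name, arg_types in RULES[btype]:
        if name == functor and len(arg_types) == len(args):
            return all(has_binding_type(a, t) for a, t in zip(args, arg_types))
    return False
```

For instance, `has_binding_type(("cons", ("pair", "x", Var("V1")), "nil"), "env")` holds because the identifier is ground while the stored value may remain unbound.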

Input to the LOGEN system would then consist of the program in which every call
is annotated as reducible (by means of the unfold annotation [25,30]) together
with the binding-type classification of the query int(exp, env, dynamic). In the
companion chapter [29] we present in more detail how this example program can
be specialized using the LOGEN system and the so-derived annotations. Further
work is needed to investigate whether our binding-time analysis can be adapted
for the Prolog setting with LOGEN’s binding-types.

6 Discussion

Constraint-based (binding-time) analysis has been considered before. In [17],
Henglein develops such a constraint-based (higher-order) binding-time analysis
for the λ-calculus by viewing the problem as a type inference problem for annotated
λ-terms in a two-level λ-calculus. A set of constraints capturing local
binding-time requirements is created and transformed into a normal form. A solver is
used to find a consistent minimal binding-time classification. The analysis is
redeveloped, concentrating on the aspect of polyvariance, for a PCF-like language
in [19]. Henglein’s analysis is scaled up by Bondorf and Jørgensen in [5], where
they construct three (monovariant) analyses to be used in the partial evaluator
Similix [4]. An important conceptual advantage of doing binding-time analysis by
constraint normalisation, mentioned among others in [5], is that the
constraint-based approach gives a more elegant description of the
analysis than a direct abstract interpretation approach in which the
source code is abstractly interpreted over the domain of binding-times. Indeed,
in the constraint-based approach, problem and solution are separated: the con- 
straint system expresses the binding-time requirements on the involved variables, 
whereas actual binding-times are contained in a solution to the constraint sys- 
tem. A practical consequence of this separation is that the data flow analysis, 
being performed at the symbolic level, needs to be performed only once for each 
predicate (in a first-order setting) rather than performing a separate analysis 
for every (abstract) call to the predicate. This result extends, at least to some
extent, to a higher-order setting in the sense that the data flow analysis needs
to be performed only once for each combination of a predicate with the closure 
information from its arguments. 
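
The separation of problem and solution described above can be illustrated with a small Python sketch: the constraint set is written down once, and each call pattern only requires solving it again. The sketch assumes a simple two-point domain (static below dynamic) and hand-written constraints of the form "target is at least the least upper bound of its sources"; both are simplifications chosen for illustration and do not reproduce the structured binding-time domain used in this chapter.

```python
STATIC, DYNAMIC = 0, 1  # two-point lattice, STATIC below DYNAMIC (simplifying assumption)


def solve(constraints, inputs):
    """Return the least solution of constraints `target >= lub(sources)`."""
    env = dict(inputs)
    for target, sources in constraints:
        env.setdefault(target, STATIC)
        for source in sources:
            env.setdefault(source, STATIC)
    changed = True
    while changed:  # iterate until nothing changes (fixed point)
        changed = False
        for target, sources in constraints:
            new = max([env[target]] + [env[s] for s in sources])
            if new != env[target]:
                env[target] = new
                changed = True
    return env


# Hand-written, illustrative constraints for a predicate like int(E, Env, R):
# the looked-up value depends on the environment, the result on both.
INT_CONSTRAINTS = [
    ("Val", ["Env"]),
    ("R", ["E", "Val"]),
]
```

Solving `INT_CONSTRAINTS` for `{"E": STATIC, "Env": DYNAMIC}` and again for `{"E": STATIC, "Env": STATIC}` reuses the same symbolic constraints and yields a different binding-time for `R` in each case.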

In this work, we have shown that a constraint-based approach is also feasible 
for the logic programming language Mercury. The available type information makes
it possible to construct a precise domain of binding-times, whereas the available mode
information makes it possible to express the data flow constraints in a sufficiently precise
way. Apart from being modular, the resulting analysis is polyvariant and able
to deal with partially instantiated data structures. A prototype implementation
of the analysis was made and in [49] we describe some experiments that show 
the practical feasibility of the analysis. An interesting topic for further research 
is to couple the binding-time analysis with an offline specialiser and to perform 
experiments to determine the obtainable speedups. 

Strongly related to our domain of binding-times is the domain proposed and 
used by Launchbury [28] who defines a system of types and derives a finite 
domain of projections over each type. Such a projection maps a value to a part 
of the value that is definitely static, thereby “blanking out” the dynamic part.
In recent work [3,2], a binding-time analysis is presented for the lambda calculus 
that allows an expression to be both static and dynamic at the same time; 
the general idea is to be able to access statically the (static) components of a 
residualised data structure. The exact relation and/or integration with a fine- 
grained domain of binding-times as employed by our technique is an interesting 
topic for further research. 

Upgrading binding-time analysis to deal with Mercury’s higher-order constructs
requires closure information. In the literature, closure analysis has also
been formulated by means of abstract interpretation [4,9] as well as by constraint
solving [16,35,18]. Bondorf and Jørgensen [5] develop a constraint-based
flow analysis that traces higher-order flow as well as flow of constructed (first- 
order) values. In this work, we have combined closure analysis with binding-time 
analysis and used constraints to express the first-order as well as the higher-order 
data flow. We have enhanced the domain of binding-times to include a set of clo- 
sures that represents the binding-time of a higher-order value, and formulated 
the constraint-generation phase as a call-dependent process in which, however,
only the higher-order parts of the call pattern determine the result of the analysis.
During constraint generation, the constraints involving higher-order values
are evaluated, and the resulting closure information is used to decide what con- 
straints to incorporate, possibly propagating closure information down into the 
called procedures. 

In Sections 3.5 and 4.3 we have discussed in detail how the analysis can be applied
to multi-module programs according to a one-module-at-a-time scenario.
If we do not wish to propagate closure information over module boundaries, the 
constraint generation phase can be performed one module at a time, bottom- 
up in the module hierarchy. Remaining issues are precisely such inter-module 
closure propagation and the handling of circularities in the module hierarchy. 
Recent work [8] presents a framework for the (call-dependent) analysis of multi- 
module programs that solves both problems. The key invariant in the approach 
of [8] is that at each stage of the process, the analysis results are correct, but 
reanalysis may, when more information is available, produce more accurate
results. The analysis performs some extra bookkeeping such that, when a mod- 
ule is analysed, it records both the call patterns occurring in the calls to the 
imported procedures, and the analysis results of the module’s exported proce- 
dures. When the recorded information contains new calls (or calls with a more 
accurate call pattern) to the imported modules, the analysis may decide to re- 
analyse the relevant imported modules with respect to the more accurate call 
patterns. Likewise, the recording of more accurate analysis results for a mod- 
ule’s exported procedures can trigger the reanalysis of those modules that would 
possibly profit from these more accurate results. Note that our binding-time 
analysis neatly fits such an approach: initially, a module’s exported procedures 
are analysed with respect to $\top$ (no closure information is available). The resulting
binding-time constraint systems are correct, but could possibly be rendered
more precise, when the procedures are (re)analysed with respect to a more ac- 
curate call pattern (one that does contain some closure information). To the 
best of our knowledge, the binding-time analysis of modular programs has been 
considered only occasionally before. Henglein and Mossin [19] note that a sym- 
bolic representation of binding-times allows a modular approach. Based on such 
a symbolic analysis, [12] present a method to specialise a multi-module program,
written in a simple yet higher-order functional language, by constructing,
for each of the modules, a generating extension, while using only the result of a
call-independent binding-time analysis. The analysis assumes that annotations
indicating whether a function must be unfolded are given by hand and is
restricted to module hierarchies without circular dependencies.

To summarise, we can state that few binding-time analyses have been de- 
veloped that are polyvariant, deal with partially instantiated data, modules 
and higher-order constructs for a realistic language. Our binding-time analy- 
sis achieves this for the Mercury language by combining a number of known 
techniques: partially instantiated structures are dealt with by incorporating a 
structured and precise domain of binding-times, polyvariance and modularity 
are achieved by computing the binding-times symbolically and higher-order in- 
formation is incorporated by propagating closure information during the sym- 
bolic phase of the analysis. Two important limitations of our technique are in 
the modularity of the approach, in particular the lack of propagation of closure 
information over module boundaries and the handling of circularities in the mod- 
ule dependency graph. Fortunately, both issues can be addressed by imposing a 
system like [8] on top of our technique. 

Abstract. Context-sensitive analysis provides information which is potentially
more accurate than that provided by context-free analysis. Such
information can then be applied in order to validate/debug the program 
and/or to specialize the program obtaining important improvements. 
Unfortunately, context-sensitive analysis of modular programs poses im- 
portant theoretical and practical problems. One solution, used in several 
proposals, is to resort to context-free analysis. Other proposals do address 
context-sensitive analysis, but are only applicable when the description 
domain used satisfies rather restrictive properties. In this paper, we ar- 
gue that a general framework for context-sensitive analysis of modular 
programs, i.e., one that allows using all the domains which have proved 
useful in practice in the non-modular setting, is indeed feasible and very 
useful. Driven by our experience in the design and implementation of 
analysis and specialization techniques in the context of CiaoPP, the Ciao 
system preprocessor, in this paper we discuss a number of design goals for 
context-sensitive analysis of modular programs as well as the problems 
which arise in trying to meet these goals. We also provide a high-level 
description of a framework for analysis of modular programs which does 
substantially meet these objectives. This framework is generic in that 
it can be instantiated in different ways in order to adapt to different 
contexts. Finally, the behavior of the different instantiations w.r.t. the 
design goals that motivate our work is also discussed. 



1 Introduction 

Analysis of logic programs has received considerable theoretical and practical 
attention. A number of successful compile-time techniques have been proposed 
and implemented which allow obtaining useful information on the program and
using such information to debug, validate, and specialize the program, obtain- 
ing important improvements in correctness and efficiency. Unfortunately, most 
of the existing techniques are still only used in prototypes and, though numer- 
ous experiments demonstrate their effectiveness, they have not made their way 
into existing real-life systems. Perhaps one of the reasons for this is that most of 
these techniques were originally designed to be applied to a complete, monolithic 
program, while programs in practice invariably have a more complex structure 
combining a number of user modules with system libraries. Clearly, organiz- 
ing program code in this modular way has many practical advantages for both 
program development and maintenance. On the other hand, performing global 
techniques such as program analysis on modular programs differs from doing so 
in a monolithic setting in several interesting ways and poses non-trivial problems 
which must be solved. 

In this work we concentrate on strict module systems in which procedures 
external to a module are visible to it only if they are part of its interface. The 
interface of a module usually contains the names of the exported procedures and 
the names of the procedures imported from other modules. The module can only 
import procedures which are among the ones exported by the other modules. 
Procedures which are not exported are not visible outside the module. 

Driven by our experience in the design and implementation of context-sensi- 
tive analysis and specialization techniques in the CiaoPP system [20,9], in this 
paper we present a high level description of a framework for analysis of modular 
programs. This framework is generic in that it can be instantiated in different 
ways in order to adapt to different contexts. The correctness, accuracy, and 
efficiency of the different instantiations is discussed and compared. 

The analysis of modular programs has been addressed in a number of previous 
works. However, most of them have focused on specific analyses with particu- 
lar properties and using more or less ad-hoc techniques. In [6] a framework is 
proposed for performing compositional analysis of logic programs in a modu- 
lar fashion, using the concept of an open program, introduced in [2]. An open 
program is a program in which part of the code is not available to the ana- 
lyzer. Nevertheless, this interesting framework is valid only for a particular set 
of abstract domains of analysis — those which are compositional. 

Another interesting framework for compositional analysis for logic programs 
is presented in [23], in this case, for binding-time analysis. Although the most 
natural way to describe abstract interpretation-based binding-time analyses is 
arguably to use a top-down, goal-dependent framework, in this work a goal- 
independent analysis framework is used in order to simplify the handling of the 
issues stemming from modularity. The choice is based on the fact that context- 
sensitivity brings important problems to a top-down analysis framework. Both 
this paper and [6] stress compositionality as a very attractive property, since 
it greatly facilitates modular analysis. However, there are many useful abstract 
domains which do not meet this property, and thus these approaches are not of 
general applicability. 

In [15] a control-flow analysis-based technique is proposed for call graph construction
in the context of object oriented languages. Although there has been
other work in this area, the novelty of this approach w.r.t. previous proposals 
is that it is context-sensitive. Also, [1] shows a way to perform modular class 
analysis by translating the object oriented program into open DATALOG pro- 
grams, in the sense of [2]. These two contributions are tailored to specific analysis 
domains with particular properties, so an important part of their work is not 
generally applicable nor reusable in a general framework. 

In [21] a two-phase analysis is proposed for incomplete imperative programs, 
starting with a fast, imprecise global analysis and then continuing with a (possi- 
bly context sensitive) analysis for each module in the program. This approach is 
not abstract interpretation-based. It is interesting to see that it appears to follow 
from the theory of abstract interpretation that if in such a two-pass approach 
the first pass “overshoots” the fixed-point, the maximum precision may not be 
recovered in the second pass. 

In [22] a method for performing separate control-flow analysis by means of 
abstract interpretation is proposed. This paper does not deal with the inter- 
modular approach studied in the present work, although it does have points in 
common with our module-aware analysis framework (Section 5). However, in 
this work the initial information needed by the abstract interpretation-based 
analyzer is provided by other analysis techniques (types and effects techniques), 
instead of taking advantage of the actual results from the analysis of the rest of 
the modules in the program. 

A preliminary study of the extension of analysis and specialization to the 
case of modular programs was presented in [19]. A full practical proposal for 
modular program analysis was presented in [4], which also presented some pre- 
liminary data from its implementation in the context of the Ciao system. Also, 
an implementation of [4] in the context of the HAL system [8] has been reported 
in [14]. 

The rest of the paper proceeds as follows: Section 2 presents a review of pro- 
gram analysis based on abstract interpretation and of the non-modular frame- 
work that we use as a starting point. Section 3 then presents some additional 
notation related to modular programs and a first, simple approach to extending 
the framework to handling such modular programs: the “flattening” approach. 
This approach is used as baseline for comparison throughout the rest of the 
paper. Section 4 then identifies a number of characteristics that are desirable 
of a modular analysis system and which the simple approach does not meet in 
general. Achieving (at least a subset of) these characteristics justifies the more 
involved approach presented in the rest of the paper. To this end, Section 5 first
discusses the modifications made to the analysis framework for non-modular pro- 
grams in order to be able to handle one module at a time. Section 6 then presents 
the actual full framework for analysis of modular programs. The framework pro- 
posed is parametric on the scheduling policies. The following sections discuss two 
scheduling policies which are fundamentally different: manual scheduling (Sec- 
tion 7), which corresponds to a scenario where one or more users decide when 
and what modules to analyze individually (but in a context-sensitive way), such
as in distributed program development, and automatic scheduling (Section 8),
where a full scheduling policy automatically determines in which order the mod- 
ules will be analyzed and continues until the process is completed (a fixed-point 
is reached). Section 9 addresses some practical implementation issues, including 
persistence and handling of libraries. Finally, Section 10 compares the behavior 
of the different instantiations of the generic framework proposed together with 
that of the flattening approach w.r.t. the desirable design features discussed in 
Section 4, and presents some conclusions. 

2 A Non-modular Context-Sensitive Analysis Framework 

The aim of context-sensitive program analysis is, for a particular description 
domain, to take a program and a set of initial call patterns and to annotate the 
program with information about the current environment at each program point 
whenever that point is reached when executing calls described by the initial call 
patterns. 



2.1 Program Analysis by Abstract Interpretation 

Abstract interpretation [7] is a technique for static program analysis in which
execution of the program is simulated on a description (or abstract) domain
($D_\alpha$) which is simpler than the actual (or concrete) domain ($D$). Values in the
description domain and sets of values in the actual domain are related via a pair
of monotonic mappings $\langle \alpha, \gamma \rangle$: abstraction $\alpha : 2^D \rightarrow D_\alpha$ and concretization
$\gamma : D_\alpha \rightarrow 2^D$, which form a Galois connection, i.e.

$\forall x \in 2^D : \gamma(\alpha(x)) \supseteq x$ and $\forall \lambda \in D_\alpha : \alpha(\gamma(\lambda)) = \lambda$.

The set of all possible descriptions represents a description domain $D_\alpha$ which is
usually a complete lattice or cpo for which all ascending chains are finite. Note
that in general $\sqsubseteq$ is induced by $\subseteq$ and $\alpha$ (in such a way that
$\forall \lambda, \lambda' \in D_\alpha : \lambda \sqsubseteq \lambda' \Leftrightarrow \gamma(\lambda) \subseteq \gamma(\lambda')$). Similarly, the operations of least upper bound ($\sqcup$) and
greatest lower bound ($\sqcap$) mimic those of $2^D$ in some precise sense. A description
$\lambda \in D_\alpha$ approximates a set of concrete values $x \in 2^D$ if $\alpha(x) \sqsubseteq \lambda$. Correctness of
abstract interpretation guarantees that the descriptions computed approximate
all of the actual values which occur during execution of the program.
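
The two Galois connection conditions can be checked mechanically on a small example. The Python sketch below uses a five-element sign domain over a tiny concrete domain; the domain, the description names and the helper functions are all assumptions chosen for illustration and are not taken from any of the systems discussed here.

```python
from itertools import combinations

D = frozenset(range(-2, 3))  # small concrete domain (assumption)

GAMMA = {  # concretization: description -> set of concrete values
    "bot": frozenset(),
    "neg": frozenset(v for v in D if v < 0),
    "zero": frozenset({0}),
    "pos": frozenset(v for v in D if v > 0),
    "top": D,
}


def alpha(values):
    """Abstraction: the most precise description whose concretization covers `values`."""
    candidates = [d for d, conc in GAMMA.items() if values <= conc]
    return min(candidates, key=lambda d: len(GAMMA[d]))


def subsets(s):
    """Enumerate all subsets of the finite set `s`."""
    items = sorted(s)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def satisfies_galois_conditions():
    """Check gamma(alpha(x)) >= x for every x and alpha(gamma(d)) == d for every d."""
    extensive = all(GAMMA[alpha(x)] >= x for x in subsets(D))
    exact = all(alpha(GAMMA[d]) == d for d in GAMMA)
    return extensive and exact
```

Here `alpha(frozenset({-1, 1}))` is `"top"`: no more precise description covers both a negative and a positive value, which is exactly the loss of information that abstraction accepts in exchange for a simpler domain.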

Different description domains may be used which capture different properties 
with different accuracy and cost. Also, for a given description domain, program, 
and set of initial call patterns there may be many different analysis graphs. 
However, for a given set of initial call patterns, a program and abstract operations 
on the descriptions, there is a unique least analysis graph which gives the most 
precise information possible. 

2.2 The Generic Non-modular Analysis Framework

We will now briefly describe the main ingredients of a generic context-sensitive 
analysis framework which computes the least analysis graph. This framework 
generalizes the particular analysis algorithms used in systems such as PLAI 
[12,13], GAIA [5], and the CLP($\mathcal{R}$) analyzer [11], and we believe captures the
essence of most context-sensitive, non-modular analysis systems. More details 
on this generic framework can be found in [10,17]. 

We first introduce some notation. CD and AD stand for descriptions in the 
abstract domain. The expression P : CD denotes a call pattern. This consists of 
a predicate call together with a call description for that predicate call. Similarly, 
P : AD denotes an answer pattern, though it will be referred to as AD when it 
is associated to a call pattern P : CD for the same predicate call. 

The least analysis graph for the program is implicitly represented in the
algorithm by means of two data structures, the answer table and the dependency
table. Given the information in these data structures it is straightforward to
construct the graph and the associated program point annotations. The answer
table contains entries of the form $P : CD \mapsto AD$. It is interpreted as: the answer
pattern for calls of the form CD to P is AD. A dependency is of the form
$P : CD_0 \Rightarrow B_{key} : CD_1$. This is interpreted as follows: if the procedure P is
called with description $CD_0$ then this causes the procedure B to be called with
description $CD_1$. The subindex key can be used in order to uniquely identify the
program point within P where B is called with calling pattern $CD_1$. Dependency
arcs represent the arcs in the program analysis graph from procedure calls to
the corresponding call pattern.
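
A minimal Python sketch of these two tables is given below. Descriptions are represented as opaque strings, and the class and method names are hypothetical; the sketch only mirrors the shape of the entries described above, not the implementation of PLAI, GAIA or any other analyzer.

```python
class AnalysisTables:
    """Answer table and dependency table, keyed as described in the text."""

    def __init__(self):
        self.answers = {}  # (P, CD) -> AD, i.e. P : CD |-> AD
        self.deps = {}     # (P, CD0, (B, key)) -> CD1, i.e. P : CD0 => B_key : CD1

    def set_answer(self, pred, cd, ad):
        """Record the answer pattern AD for calls of the form CD to P."""
        self.answers[(pred, cd)] = ad

    def add_dependency(self, pred, cd0, callee, key, cd1):
        """Record that P called with CD0 calls B at program point `key` with CD1."""
        self.deps[(pred, cd0, (callee, key))] = cd1

    def dependents(self, callee, cd1):
        """Call patterns P : CD0 that call `callee` with description `cd1`."""
        return sorted({(p, cd0) for (p, cd0, (b, _)), d in self.deps.items()
                       if b == callee and d == cd1})
```

The `dependents` lookup is the kind of query a fixed-point algorithm needs when the answer for a call pattern changes and the call patterns that rely on it have to be revisited.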

Intuitively, different analysis algorithms correspond to different graph traver- 
sal strategies which place entries in the answer table and dependency table as 
new nodes and arcs in the program analysis graph are encountered. To capture 
the different graph traversal strategies used in different fixed-point algorithms, 
we use a priority queue. The queue contains the events to process. Different pri- 
ority strategies correspond to different analysis algorithms. Thus, the third, and 
final, structure used in our generic framework is a tasks queue. 

When an event being added to the tasks queue is already in the queue, a single 
event with the maximum of the priorities is kept in the queue. Also, only one 
arc of the form $P : CD \Rightarrow B_{key} : CD'$ for each tuple $(P, CD, B_{key})$ exists in the
dependency table: the last one added. The same holds for entries $P : CD \mapsto AD$
for each tuple $(P, CD)$ in the answer table.
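
The behaviour of the tasks queue on duplicate events can be sketched in a few lines of Python. The class below is a hypothetical illustration that favours clarity over efficiency (removal scans all queued events); a real engine would more likely use a heap.

```python
class TasksQueue:
    """Priority queue of events in which each event is queued at most once."""

    def __init__(self):
        self._priority = {}  # event -> priority

    def add(self, event, priority):
        # An event already in the queue is kept once, with the maximum priority.
        self._priority[event] = max(priority, self._priority.get(event, priority))

    def pop(self):
        """Remove and return the highest-priority event."""
        event = max(self._priority, key=self._priority.get)
        del self._priority[event]
        return event

    def __len__(self):
        return len(self._priority)
```

Adding event `"a"` with priority 1 and later with priority 5 leaves a single entry for `"a"`, which is then removed before any event of priority 2.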

Figure 1 shows the architecture of the framework. The Code corresponds to 
the (source) code of the program to be analyzed. By Entries we denote the initial 
starting points for analysis. The box Description Domain Operations represents 
the definition of operations which are domain dependent. The circle represents 
the Analysis Engine, which has the three data-structures mentioned above, i.e., 
the answer table, the dependency table, and the tasks queue. Initially, for each 
analysis these three structures are empty and the analysis engine takes care of 
processing the events on the priority queue by repeatedly removing the highest
priority event and calling the appropriate event-handling function. This in
turn consults and modifies the contents of the answer and dependency tables.
When the tasks queue becomes empty then the analysis engine has reached a
fixed-point. This implies that the least analysis graph has been found. We will
use $Analysis_{D_\alpha}(Q, E) = (AT, DT)$ to denote that the analysis of program Q
for initial descriptions E in domain $D_\alpha$ produces the answer table AT with
dependency table DT.

Fig. 1. Non-Modular Analysis Framework

2.3 Predefined Procedures 

In order to simplify their presentation, formalizations of program analysis often
do not consider predefined procedures. However, in practice, program analysis
implementations allow the use of predefined (language built-in and/or library)
procedures* in the programs to be analyzed. These external procedures, whose
code is not available in the program being analyzed, are often handled in an ad hoc
way. Thus, in fairness, non-modular program analyses are more accurately
represented by adding to the framework a builtin procedure function which
essentially hardwires the answer table for these external procedures. This function
is represented in Figure 1 by the box builtin procedure function. We will use CV
and AV to denote, respectively, the set of all call patterns and the set of all
answer patterns. The builtin procedure function can be formalized as a function
$BE : CV \rightarrow AV$. For every call pattern P : CD where P is a builtin procedure,
BE(P : CD) returns a description AD which is assumed to be correct in the
sense that it is a safe approximation, i.e. an over-approximation of the actual
answer pattern for P : CD.

It is important to note that the data structures which are outside the analysis
engine (code, entries, description domain operations, and builtin procedure
function) are read-only. However, though the code and entries are supposed to
change for the analysis of each particular program, the builtin procedure function
can be considered to be fixed, for each description domain $D_\alpha$, in that it does not
vary from the analysis of one program to another. Indeed, it can be considered
to be part of the analyzer. Thus, the builtin procedure function is not explicitly
represented as an input to the analysis algorithm.

* In our modular design, a library can be treated simply as (yet another) module in
the program. However, special practical considerations for them will be discussed in
Section 9.3.



3 The Flattening Approach to Modular Processing 

We start by introducing some notation. We will use m and m' to denote modules.
Given a module m, by imports(m) we denote the set of modules which
m imports. Figure 2 presents a modular program. Modules are represented as
boxes and there is an arrow from m to m' iff m imports m'. In our example,
imports(a) = {b, c}. By depends(m) we refer to the set generated by the transitive
closure of imports, i.e. depends(m) is the least set such that $imports(m) \subseteq depends(m)$
and $m' \in depends(m)$ implies that $imports(m') \subseteq depends(m)$. In
our example, depends(a) = {b, c, d, e, f}. Note that there may be circular dependencies
among modules. In our example, $e \in depends(d)$ and $d \in depends(e)$. A
module m is a leaf if $depends(m) = \emptyset$. In our example, the only leaf module is
f. By callers(m) we denote the set of modules which import m. In the example,
callers(e) = {b, c, d}. Also, we define $related(m) = callers(m) \cup imports(m)$. In
our example, related(b) = {a, d, e}.

The program unit of a given module m is the finite set of modules containing
m and the modules on which m depends: $program\_unit(m) = \{m\} \cup depends(m)$.
m is called the top-level module of its program unit. In our example,
program_unit(a) = {a, b, c, d, e, f} and program_unit(c) = {c, d, e, f}. A program
unit U is self-contained in the sense that
$\forall m \in U : m' \in imports(m) \rightarrow m' \in U$.
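
The notation above translates directly into code. In the Python sketch below, the `IMPORTS` table is one import relation that is consistent with every set quoted in the text; the figure itself is not reproduced here, so the exact edges are an assumption.

```python
IMPORTS = {  # one import relation consistent with the sets quoted in the text
    "a": {"b", "c"},
    "b": {"d", "e"},
    "c": {"e", "f"},
    "d": {"e"},
    "e": {"d"},
    "f": set(),
}


def imports(m):
    return set(IMPORTS[m])


def depends(m):
    """Transitive closure of imports: the least set closed under imports."""
    seen, stack = set(), list(IMPORTS[m])
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(IMPORTS[n])
    return seen


def callers(m):
    return {n for n, imported in IMPORTS.items() if m in imported}


def related(m):
    return callers(m) | imports(m)


def program_unit(m):
    return {m} | depends(m)


def is_leaf(m):
    return not depends(m)
```

Because of the circular dependency between d and e, `depends("d")` contains d itself, which follows from the least-set definition.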




Fig. 2. An Example of Module Dependencies 



Several compilation tasks such as program analysis and specialization are 
traditionally considered global, as opposed to local. Usually, local tasks process 
one procedure at a time and all the information required for performing the task 
can be obtained by inspecting that procedure. In contrast, in global tasks the 
results of processing a part of the program (say, a procedure) may be needed
in order to process other parts of the program. Thus, global processing often 
requires iterating on the whole program until a fixed-point is reached. 

In a modular setting, it may well be the case that part of the information 
needed to perform the task on (a procedure in) module m has to be computed 
in modules other than m. We will refer to the information originated in modules 
different from m as inter-modular information in contrast to the information 
originated in m itself, which we will call intra-modular. 

Example 1. In context-sensitive program analysis there is an information flow 
of both call and success patterns to and from procedures in different modules. 
Thus, program analysis requires inter-modular information. For example, the 
module c receives call patterns from module a since callers(c) = {a}, and it
has to propagate the corresponding success patterns to a. In turn, c provides
{e, f} = imports(c) with call patterns and receives success patterns from them.



3.1 Flattening a Program Unit Vs. Modular Processing 

Applying a framework for non-modular programs to a module m has the diffi- 
culty that m may not be self-contained. However, there should be no problem 
in applying the framework if m is a leaf module. Furthermore, given a global
process such as program analysis, it is not obvious that
it makes much sense to apply the process to a module m alone. In principle,
it makes more sense to apply it to program units since they are conceptually
self-contained. Thus, given a module m one natural approach seems to be to
apply the tool (simultaneously) to all the modules in U = program_unit(m).

Given a program unit U it is always possible to build a single module m_flat
which is equivalent to U and which is a leaf. The process of constructing such a
module m_flat usually only amounts to renaming apart identifiers in the different
modules in U so as to avoid name clashes. We will use flatten(U) = m_flat to
denote that the module m_flat is the result of renaming apart the code in each
module in U and concatenating its code into a monolithic module m_flat. This points
to a simple solution to the problem of processing modular programs (at least
for the case in which all the code is available): to transform program_unit(m)
into the equivalent monolithic program m_flat. It is then straightforward to apply
any tool for non-modular programs to the leaf module m_flat. Figure 3 represents
the case in which the non-modular analysis framework is used on the flattened
program.
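
Flattening by renaming apart can be sketched as follows in Python. The program representation (each module maps procedure names to the list of procedures they call) and the `module:procedure` naming scheme are assumptions made for this illustration; export lists are ignored for brevity, so the sketch does not enforce the strict visibility rules of the module system.

```python
def flatten(unit):
    """Rename apart and concatenate the procedures of all modules in a program unit.

    `unit` maps a module name to {"imports": [...], "procs": {proc: [callees]}}.
    The result maps each qualified procedure name to its qualified callees.
    """

    def resolve(module, callee):
        # A callee is either local or defined in one of the imported modules.
        if callee in unit[module]["procs"]:
            return f"{module}:{callee}"
        for imported in unit[module]["imports"]:
            if callee in unit[imported]["procs"]:
                return f"{imported}:{callee}"
        raise KeyError(f"{callee} is not visible in module {module}")

    flat = {}
    for module, info in unit.items():
        for proc, callees in info["procs"].items():
            flat[f"{module}:{proc}"] = [resolve(module, c) for c in callees]
    return flat
```

Two modules that each define a private `helper` procedure end up with distinct names in the flattened program, so the name clash disappears while the call structure is preserved.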

Given the existence of an implementation for non-modular analysis, this ap- 
proach is often simple to apply. Also, this flattening approach has theoretical 
interest. It can be used, for example, in order to compare the efficiency of differ- 
ent approaches to modular handling of programs w.r.t. the flattening approach. 
However, as a practical way in which to actually perform analysis of program 
units this approach has important drawbacks. This issue will be discussed in 
more detail in Section 10.

Fig. 3. Using non-modular analysis on a flattened program



4 Design Goals for Analysis of Modular Programs

Before presenting our proposals for analysis of modular programs, we will discuss 
the main features which should be taken into account when designing and/or 
implementing a tool for context-sensitive analysis of modular programs. As often
happens in practice, some of these features conflict with others,
and this might make it impossible to find a framework which behaves optimally
w.r.t. all of them.

Module-Awareness We consider a framework module-aware when it has been
designed with modules in mind. Thus, it is applicable to a module m by using the
code of m and some “interface” information for the modules in imports(m). Such
interface information will in general consist of a summary of previous analysis
results for such modules, if such results are available, or a safe approximation if
they are not.

Though transforming a non-modular framework into a module-aware one
may seem trivial, it requires identifying precisely which information about the
results of applying the tool to each of the modules in imports(m) must be
stored in order to apply the tool to m. This corresponds in general to the
inter-modular information. It is also desirable that the amount of such
information be minimal.

Example 2. The framework for non-modular analysis in Section 2 is indeed non- 
modular since it requires the code of all procedures (except possibly for some 
predefined ones) to be available to the analyzer. It will produce wrong results 
when applied to non-leaf modules since a missing procedure can only be deemed 
as an error, unless the framework is aware that such a procedure can be imported. 

Correctness Applying the tool to a module m should produce
results which are correct. The notion of correctness itself can in general be lifted
from the non-modular case to the modular case without great difficulties. A more
complex issue is how to extend a framework to the modular case in such a way
that correctness is preserved.

Accuracy Similarly, the analysis results for a module m should be as accurate
as possible. The notion of accuracy can be defined by comparing the analysis 
results with those which would be obtained using the flattening approach pre- 
sented in Section 3.1 above, since the latter always computes the most accurate 
information possible, which corresponds to the least analysis graph. 

Termination A framework for analysis of modular programs should guarantee
termination (at least) in all cases in which the flattening approach terminates
(which, typically, is for every program). Such termination is guaranteed by
choosing description domains with specific characteristics, such as finite
height or finite ascending chains, and/or by incorporating a widening operator.



Efficiency in Time The time required to apply the tool should be reasonable.
We will understand “reasonable” as not exceeding an acceptable threshold relative
to the time taken using the flattening approach.

Efficiency in Memory In general, one of the main expected advantages of the 
modular approach is that the total amount of memory required to handle each 
module separately should be smaller than that needed in the flattening approach. 

No Need for Analyzing All Call Patterns Under certain circumstances, applying 
a tool on a module m may require processing only a subset of the call patterns 
rather than all call patterns for m. In order to achieve this, the model must keep 
track of fine-grained dependencies. This will allow marking exactly those call 
patterns which need processing. Other call patterns not marked do not need to 
be processed. 

Support for the Co-Existence of Multiple Program Units/Applications In a
modular setting it is often the case that a particular module is used in several
applications. Support for software reuse is thus a desirable feature. However, this
poses additional and interesting challenges to the tools, some of which will be 
discussed in Section 9. 

Support for Source Changes What happens if the source of a module changes
during processing? Some tools will not allow this at all, and if it happens all
the processing has to start again from scratch. This has the disadvantage that
the tool is then not incremental, since a (possibly minor) change in a module
invalidates the information for the whole program unit. Other tools may delete the
information which may depend on the changed code, but still keep the
information which does not depend on it.

Persistence This feature indicates that the inter-modular information can be
stored in a persistent medium, such as a file stored on disk or a database, and
recovered later.

Fig. 4. Module-aware analysis framework



5 Analysis of Modular Programs: the Local Level 

As a first step towards introducing our analysis framework for modular programs, 
which will be presented in Section 6 below, in this section we discuss the main 
ingredients which have to be added to an analysis framework for non-modular 
programs in order to be able to handle one module at a time. 

Analyzing a module separately presents the difficulty that, from the point
of view of analysis, the code to be analyzed is incomplete in the sense that the
code for procedures imported from other modules is not available to analysis.
More precisely, during analysis of a module m there may be calls $P : CP$ such
that the procedure P is not defined in m but instead is imported from another
module $m' \in imports(m)$. We refer to determining the value of $AP$ to be used for
$P : CP \mapsto AP$ as the imported success problem. In addition, in order to obtain
analysis information for $m'$ which is as accurate as possible we need to somehow
propagate the call $P : CP$ to $m'$ so that the next time $m'$ is analyzed such a
call pattern is taken into account. We refer to this as the imported calls problem.
Note that in this case analysis has to be module-aware in order to determine
whether a given procedure is local, imported, or predefined.

Figure 4 shows the architecture of an analysis framework which is
module-aware. This framework is an extension of the non-modular framework in Figure 1.
One minor change is that the read/write data structures internal to the analysis
engine have been renamed with the prefix “local”. So now we have the local
answer table, the local dependency table, and the local task queue. Also, the box
which represents the code now contains m, indicating that it contains the single
module m.

The shaded boxes in Figure 4 indicate the main differences w.r.t. the
non-modular framework. One is that in the module-aware framework there is an
additional read-only^ data structure, the global answer table, or GAT for short. Its
contents are identical in format to those in the answer table of the non-modular
framework. There are however some differences: (1) The GAT contains analysis
results which were obtained prior to the current analysis step. (2) The GAT
contains entries which correspond to predicates defined in imports(m), whereas
all entries in the local answer table (or LAT for short) are for predicates defined
in m itself. (3) Only information on exported predicates is available in the GAT,
whereas the LAT contains information for all predicates in m regardless of
whether they are exported or not.

^ In fact, this data structure is read/write at the global level discussed in Section 6
below, but it is read-only as regards our engine for analysis of one module.

5.1 Solving the Imported Success Problem 

The second important difference is that the module-aware framework requires
the use of a success policy, or SP for short, which is represented in Figure 4 with
a shaded box surrounding the GAT. The SP can be seen as an intermediary
between the GAT and the analysis engine. The behavior of the analysis engine for
predicates defined in m remains exactly as before. The SP is needed because, though
the information in the GAT will be used in order to obtain answer patterns for
imported predicates, given a call pattern $P : CP$ it will often be the case that
an entry of exactly the form $P : CP \mapsto AP$ does not exist in the GAT. In such a
case, the information already present in the GAT may be of value in order to obtain
a (temporary) answer pattern $AP$. Note that the GAT together with the SP
allows solving the “imported success problem”.

In contrast, in many formalizations of non-modular analysis there is no
explicit success policy. This is because if the call pattern $P : CP$ has not been
analyzed yet, the analysis algorithm forces its computation. Thus, the results of
analysis do not depend on any particular success policy: when analysis reaches
a fixed-point there is always an entry of the form $P : CP \mapsto AP$ for any call
pattern $P : CP$ which appears in the analysis graph. Unfortunately, in a
modular setting it is not directly possible to force the analysis of predicates defined
in other modules. Those modules may have already been analyzed or they may
be analyzed in the future. We will simply do what we can given the information
available in the GAT.

We will use $\mathcal{GAT}$ to denote the set of all global answer tables. The success
policy can be formalized as a function $SP : \mathcal{CP} \times \mathcal{GAT} \to \mathcal{AP}$. Several success
policies can be defined which provide over- or under-approximations of the exact
answer pattern $AP^{=}$ with different degrees of accuracy. Note that this exact value
$AP^{=}$ is the one which the flattening approach would compute. In this work we
consider two kinds of success policies: those which are guaranteed to always
provide over-approximations, i.e., $AP^{=} \sqsubseteq SP(P : CP, GAT)$, and those which
provide under-approximations, i.e., $SP(P : CP, GAT) \sqsubseteq AP^{=}$. We will use the
superscript $^{+}$ (resp. $^{-}$) to indicate that a success policy over-approximates (resp.
under-approximates). As will be discussed later in the paper, both over- and
under-approximations are useful in different contexts and for different purposes.
Since it is always required to know whether a success policy over- or
under-approximates, we will mark all success policies in either way.

Example 3. A very precise over-approximating success policy is the function
$SP^{+}_{All}$ defined below, already proposed in [19]:

$$SP^{+}_{All}(P : CP, GAT) = topmost(CP) \sqcap_{AP' \in app} AP'$$ where

$$app = \{AP' \mid (P : CP' \mapsto AP') \in GAT \text{ and } CP \sqsubseteq CP'\}$$

The function topmost obtains the topmost answer pattern for a call pattern. The
notion of topmost description was already introduced in [3]. Informally, a topmost
description keeps those properties which are downwards closed whereas it loses
those which are not. Note that taking $\top$ as answer pattern is a correct
over-approximation, but often less accurate than using topmost substitutions. For
example, if a variable is known to be ground in the call pattern, it will continue
being ground in the answer pattern and taking $\top$ as the answer pattern would
lose this information. However, the fact that a variable is free on call does not
guarantee that it will keep on being free on success.

We refer to this success policy as $SP^{+}_{All}$ because it uses all entries in the GAT
which are applicable to the call pattern, in the sense that the call pattern already
computed is more general than the call being analyzed.

Example 4. The counterpart of $SP^{+}_{All}$ is the function $SP^{-}_{All}$, which is defined as:

$$SP^{-}_{All}(P : CP, GAT) = \bigsqcup_{AP' \in app} AP'$$ where

$$app = \{AP' \mid (P : CP' \mapsto AP') \in GAT \text{ and } CP' \sqsubseteq CP\}$$

Note the change in the direction of the applicability relation (the call pattern in
the GAT has to be more particular than the one being analyzed) and the use
of the lub operator instead of the glb. Also, note that taking, for example, $\bot$ as
an under-approximation is correct but $SP^{-}_{All}$ is more precise.
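
The two success policies of Examples 3 and 4 can be sketched over a deliberately simple description domain. Everything about the domain below is an assumption made for illustration: a description is the set of argument positions known to be ground, so knowing more ground arguments means being lower in the lattice, and `None` plays the role of the bottom element. Because groundness is downwards closed, `topmost` is simply the identity here.

```python
BOTTOM = None          # least element: no successful computation described
TOP = frozenset()      # greatest element: nothing is known to be ground


def leq(d1, d2):
    """d1 is at least as specific as d2 in the toy groundness lattice."""
    if d1 is BOTTOM:
        return True
    if d2 is BOTTOM:
        return False
    return d1 >= d2    # a superset of ground positions is more specific


def glb(d1, d2):
    if d1 is BOTTOM or d2 is BOTTOM:
        return BOTTOM
    return d1 | d2


def lub(d1, d2):
    if d1 is BOTTOM:
        return d2
    if d2 is BOTTOM:
        return d1
    return d1 & d2


def topmost(cp):
    # Ground arguments stay ground, so the whole call description survives.
    return cp


def sp_all_plus(pred, cp, gat):
    """Over-approximation: glb of topmost(cp) and every applicable answer."""
    result = topmost(cp)
    for (p, cp2), ap2 in gat.items():
        if p == pred and leq(cp, cp2):      # stored call is more general
            result = glb(result, ap2)
    return result


def sp_all_minus(pred, cp, gat):
    """Under-approximation: lub of the answers of more particular calls."""
    result = BOTTOM
    for (p, cp2), ap2 in gat.items():
        if p == pred and leq(cp2, cp):      # stored call is more particular
            result = lub(result, ap2)
    return result


# GAT with two entries for a hypothetical predicate p/3.
gat = {
    ("p", frozenset()): frozenset({3}),
    ("p", frozenset({1, 2})): frozenset({1, 2, 3}),
}
cp = frozenset({1})
over = sp_all_plus("p", cp, gat)
under = sp_all_minus("p", cp, gat)
```

In this run the over-approximation only uses the entry for the more general call, while the under-approximation only uses the entry for the more particular one, and the under-approximation is below the over-approximation in the lattice.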

5.2 Solving the Imported Calls Problem 

The third important difference w.r.t. the non-modular framework is the use of
the temporary answer table (or TAT for short), which is represented as a
shaded box within the analysis engine of Figure 4. This answer table will be used
to store call patterns for imported predicates which are not yet present in the GAT
and whose answer pattern has been obtained (approximated) using the success
policy on the entries currently stored in the GAT. The TAT is used as a cache for
imported call patterns and their corresponding answer patterns, thus avoiding
having to repeatedly apply the success policy on the GAT for equivalent call
patterns, which is an expensive operation. Also, after analysis of the current
module is finished, the existence of the TAT simplifies the way in which the
global data structures need to be updated. This will be discussed in more detail
in Section 6 below.
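
The caching role of the TAT can be sketched as follows. The class name, the string-valued call and answer patterns, and the placeholder success policy are all assumptions for this example; only the behavior (exact GAT entries are used directly, everything else goes through the success policy once and is then remembered) follows the description above.

```python
class ImportedAnswers:
    """Answers calls to imported predicates using a GAT, a policy and a TAT."""

    def __init__(self, gat, success_policy):
        self.gat = gat              # read-only from the engine's point of view
        self.sp = success_policy    # function (pred, cp, gat) -> answer pattern
        self.tat = {}               # temporary answer table (cache)
        self.sp_calls = 0           # counts actual applications of the policy

    def answer(self, pred, cp):
        key = (pred, cp)
        if key in self.gat:         # an exact entry already exists
            return self.gat[key]
        if key not in self.tat:     # first time this imported call is seen
            self.sp_calls += 1
            self.tat[key] = self.sp(pred, cp, self.gat)
        return self.tat[key]


def top_policy(pred, cp, gat):
    # Placeholder policy: always return the least precise description.
    return "top"


gat = {("p", "cp_a"): "ap_a"}
lookup = ImportedAnswers(gat, top_policy)
first = lookup.answer("p", "cp_a")    # exact GAT entry, no policy call
second = lookup.answer("p", "cp_b")   # policy applied, result stored in TAT
third = lookup.answer("p", "cp_b")    # served from the TAT
```

When analysis of the module finishes, the contents of `lookup.tat` are exactly the imported call patterns that were not yet in the GAT, which is the information the global level needs.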

We use $MAnalysis(m, E_m, SP, GAT) = (LAT_m, LDT_m, TAT_m)$ to denote
that the module-aware analysis framework returns $(LAT_m, LDT_m, TAT_m)$ when
applied to module m for initial call patterns $E_m$ with SP and GAT.

Fig. 5. A two-level framework for analysis of modular programs



6 Analysis of Modular Programs: the Global Level 

After discussing the local-level issues which appear when analyzing a module, in 
this section we present a complete framework for the analysis of modular pro- 
grams. Since analysis is a global task, an analysis framework should not only deal 
with local-level information, but also with global-level information. A graphical 
representation of our framework is depicted in Figure 5. The main idea is that we 
have to add a higher-level component to the framework which takes care of the 
inter-modular information, as opposed to the intra-modular information which 
is handled by the local-level subsystem described in the previous section. 

As a result, analysis of modular programs is best seen as a two-level process. 
Note that the inner, lightly shaded, rectangle corresponds exactly to Figure 4 
as it is a module-aware analysis system. It is interesting to see how the data 
structures in the global and local levels are indeed very similar. The similarities
and differences between the GAT and LAT have been discussed already in
Section 5 above. Regarding the global and local dependency tables (GDT and
LDT respectively), they are used in order to determine as precisely as
possible which parts of the analysis graph have to be recomputed. The GDT is
used in order to add events to the global task queue (GTQ) whereas the LDT is 
used to add events (arcs) to be (re-)analyzed to the local task queue (LTQ). We 
can define the events to be processed at the global level using different levels of 
granularity. As usual, the finer-grained these events are, the more detailed and 
thus more effective the handling of the events can be. One obvious possibility 
is to use modules as events. This means that all call patterns which correspond 
to a module are handled simultaneously whenever the module is selected at 
the global level. A more refined possibility is to keep events at the call pattern 
level. This, together with sufficiently detailed information in the GDT will allow 
incrementality at the call pattern level rather than module level.

6.1 Parameters of the Framework

The framework has three parameters. The first is the program unit to be
analyzed. Note that the code may not be physically stored in
the tool’s memory since it is already on external storage. However, the
framework may maintain some information on the program unit, such as
dependencies among modules, strongly connected components, and any other information
which may be useful in order to guide analysis. In the figure the program unit is
shown, as an example, as composed of four modules.
The second parameter is the entry policy, which determines the way in
which the GTQ and GAT should be initialized whenever analysis of a program
unit is started. Depending on how the success policy is defined, entries for all
procedures exported in each of the modules in the program unit may or may not
be required in the GAT and GTQ.

Finally, the scheduling policy determines the order in which the entries in the
GTQ should be processed. The efficiency with which the fixed-point is reached
can vary greatly from one scheduling policy to another. Since the framework
presented in Figure 5 has just one analysis engine, processing a call pattern in
a module other than the one currently loaded has a significant cost associated
with it, since this often requires context switching from the current module to a new
module. Thus, it is often a good idea to process all or many of the call patterns
in the GTQ which correspond to the module which is being analyzed in order to
minimize the number of times the analysis tool has to switch from one module
to another. In the rest of the paper we consider that events in the GTQ are answer
patterns which would benefit from (re-)analysis. The role of the scheduling policy
is to select a set of patterns from the GTQ, all of which must belong to the
same module m, to be analyzed. Note that a scheduling policy based on modules
can always be obtained by simply processing at each analysis step all events in
the GTQ which correspond to m.
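
A module-based scheduling policy of this kind can be sketched in a few lines. The event format `(module, predicate, call_pattern)` and the heuristic of preferring the module with the most pending events are assumptions made for the example; the framework itself leaves the choice of module open.

```python
from collections import Counter


def schedule_by_module(gtq):
    """Select one module and all events in the queue that belong to it."""
    if not gtq:
        return None, []
    counts = Counter(module for module, _pred, _cp in gtq)
    # Hypothetical heuristic: take the module with the most pending events,
    # breaking ties alphabetically so that the choice is deterministic.
    module = max(sorted(counts), key=counts.get)
    selected = [event for event in gtq if event[0] == module]
    return module, selected


gtq = [
    ("a", "main", "cp0"),
    ("c", "p", "cp1"),
    ("c", "q", "cp2"),
    ("e", "r", "cp3"),
]
module, selected = schedule_by_module(gtq)
```

Selecting every event of the chosen module in one step means the module only has to be loaded once for all of them.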



6.2 How the Global Level Works 

As already mentioned, analysis of a modular program starts by initializing
the global data structures as indicated by the entry policy. At each step, the
scheduling policy is used to determine the set $E_m$ of entries for module m
which are to be processed. They are removed from the GTQ and copied into the
data structure Entries. The code of the module m is also copied to code. Then,
$MAnalysis(m, E_m, SP) = (LAT_m, LDT_m, TAT_m)$ is computed, and the global
data structures are updated, as detailed in Section 6.3 below. As a result of this,
new events may be added to the GTQ. Analysis terminates when there are no more
events to process in the GTQ or when the scheduling strategy does not select any
further events.
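
The control flow of the global level can be summarized in a short driver loop. This is only a sketch: the three function parameters stand in for the scheduling policy, the local-level analysis and the global update, and the toy versions used below are hypothetical stubs rather than real analysis code.

```python
def analyze_unit(gtq, select, analyze_module, update_global):
    """Run global-level steps until the queue is empty or nothing is selected."""
    steps = 0
    while gtq:
        module, entries = select(gtq)
        if not entries:                  # the scheduling policy selects nothing
            break
        for event in entries:            # move the selected events out of GTQ
            gtq.remove(event)
        lat, ldt, tat = analyze_module(module, entries)
        gtq.extend(update_global(module, entries, lat, ldt, tat))
        steps += 1
    return steps


def first_module(gtq):
    # Toy scheduling policy: all events of the module at the head of the queue.
    module = gtq[0][0]
    return module, [event for event in gtq if event[0] == module]


log = []


def fake_analysis(module, entries):
    # Stub for the local level: records the request, returns empty tables.
    log.append((module, len(entries)))
    return {}, {}, {}


def fake_update(module, entries, lat, ldt, tat):
    # Pretend that analyzing "a" produces one new request for module "c".
    return [("c", "p", "cp1")] if module == "a" else []


steps = analyze_unit(
    [("a", "main", "cp0"), ("a", "aux", "cp0")],
    first_module, fake_analysis, fake_update,
)
```

With these stubs the loop analyzes module `a` once for both of its entries, then module `c` for the event generated by the update, and stops when the queue is empty.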

Each entry in the GTQ is of one of the following three types: over-approximation,
under-approximation, or invalid, according to the reason why it should be
re-analyzed. An entry $P : CP \mapsto AP$ which is an over-approximation is marked
$P : CP \mapsto^{+} AP$. This indicates that the answer pattern $AP$ is possibly an
over-approximation since it depends on a call pattern whose answer pattern has
been determined to be an over-approximation. In other words, the accuracy of
$P : CP \mapsto^{+} AP$ may be improved by re-analysis. Similarly, under-approximations
are marked $P : CP \mapsto^{-} AP$ and they indicate that $AP$ is probably an
under-approximation since it depends on a call pattern whose success pattern has
increased. As a result, the call pattern should be re-analyzed to guarantee
correctness. Finally, invalid entries carry a third, distinct mark. They indicate that
the relation between the current answer pattern $AP$ and the one resulting from
recomputing it for $P : CP$ is unpredictable. This often indicates that the source
code of the module has changed in such a way that the analysis results for some of
the exported procedures are incompatible with previous ones. Handling this
kind of event is discussed in more detail in Section 6.4 below.

6.3 Updating the Global State 

In Section 5 we presented how the local-level subsystem, given a module
m, can compute the corresponding $LAT_m$, $LDT_m$, and $TAT_m$. However, once
analysis of module m is done, the analysis results of module m have to be used in
order to update the global state prior to starting analysis of any other module.

We now briefly discuss how this updating is done. For each initial call pattern
$P : CP$ in Entries we compare the previous answer pattern $AP$ with the newly
computed one $AP'$. If $AP = AP'$ then this call pattern has not been affected by
the latest analysis. However, it is also possible that the answer pattern “evolves”
in different analysis iterations. If we use $SP^{+}$, the natural thing is that the new
answer pattern is more specific than the previous one, i.e., $AP' \sqsubseteq AP$. In such a
case those call patterns which depend on $P : CP$ can also improve their success
pattern. We use the GDT to locate all such patterns and we add them to the
GTQ with the + mark. Conversely, if we use $SP^{-}$, the natural thing is that
$AP \sqsubseteq AP'$. We then add events marked −.

In a typical situation, and if modules do not change, all events in the GTQ will
be approximations of the same sign. This depends on the success policy used.
If the success policy is of kind $SP^{+}$ (resp. $SP^{-}$) then the events which will be
added to the GTQ will also be over-approximations (resp. under-approximations).
In turn, when they are processed they will introduce other over-approximations
(resp. under-approximations).

The $TAT_m$ is also used to update the global state. All entries in $TAT_m$ are
added to the GAT and GTQ marked with the same sign as the success policy used.
Last, we also have to update the GDT. For this, we first erase all entries for
any of the call patterns which we have just analyzed, and which are thus stored
in Entries. Then we add an entry of the form $P : CP \to H : CP'$ for each
imported procedure H which is reachable with call pattern $CP'$ from an initial
call pattern $P : CP$. Note that this can easily be determined using the LDT.
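
The propagation step described in this section can be sketched as a single function. The table layouts are assumptions for the example: the GAT maps a `(predicate, call_pattern)` key to an answer pattern, and the GDT maps each such key to the set of imported keys it depends on. The sketch does not check the direction of the change in the lattice; it only reacts to the answer being different.

```python
def propagate(key, new_ap, gat, gdt, gtq, sign):
    """Store a new answer for `key` and queue the call patterns depending on it.

    `sign` is "+" or "-" according to the kind of success policy in use.
    Returns the list of events that were added to `gtq`.
    """
    old_ap = gat.get(key)
    gat[key] = new_ap
    if old_ap == new_ap:
        return []                        # nothing changed, nothing to redo
    # Every call pattern whose GDT entry mentions `key` may be affected.
    dependents = sorted(c for c, callees in gdt.items() if key in callees)
    events = [(caller, sign) for caller in dependents]
    gtq.extend(events)
    return events


gat = {("c:p", "cp1"): "ap_old"}
gdt = {
    ("a:main", "cp0"): {("c:p", "cp1")},
    ("a:aux", "cp0"): {("e:r", "cp3")},
}
gtq = []
events = propagate(("c:p", "cp1"), "ap_new", gat, gdt, gtq, "+")
unchanged = propagate(("c:p", "cp1"), "ap_new", gat, gdt, gtq, "+")
```

Only the call pattern of `a:main` is queued, because it is the only GDT entry that mentions the changed key; repeating the update with the same answer adds nothing.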

6.4 Recovering from an Invalid State 

If the code of a module m has changed since it was last analyzed, it can be the
case that the global information available is invalid. This happens when, in the
results of re-analysis of m, any of the exported predicates has an answer pattern
which is incompatible with the previous results. In this case, all information
dependent on the new answer patterns might have become invalid, as discussed
in Section 6.2. The question is how to minimize the impact of such a situation.

The simplest solution is to (transitively) erase any information of other mod- 
ules which depends on the invalidated one. This solution may not be very ef- 
ficient, as it ignores all results of previous analyses of other modules even if 
the changes performed in the module are minor, or only affect directly related 
modules. Another alternative is to launch an automatic recovery process as soon 
as invalid analysis results are detected (see [4]). This process has to reanalyze 
the modules directly affected by the invalidated answer pattern(s). If the new 
answer patterns coincide with the old ones then the changes do not affect this 
module and the process terminates. Otherwise, it continues transitively with the 
directly related modules. 
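
The difference between the two strategies can be sketched on a module dependency graph. The `callers` table and the `answers_changed` callback are hypothetical: the first maps each module to the modules that import it, and the second stands in for re-analyzing a module and comparing its new answer patterns with the old ones.

```python
def affected_modules(changed, callers):
    """Simple strategy: every module that transitively depends on `changed`."""
    seen, stack = set(), [changed]
    while stack:
        module = stack.pop()
        for caller in callers.get(module, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen


def recover(changed, callers, answers_changed):
    """Recovery process: re-analyze callers level by level, stopping early.

    A module's own callers are visited only if its answers actually changed.
    Returns the modules re-analyzed, in the order in which they were visited.
    """
    reanalyzed, seen, frontier = [], {changed}, [changed]
    while frontier:
        module = frontier.pop(0)
        for caller in callers.get(module, ()):
            if caller in seen:
                continue
            seen.add(caller)
            reanalyzed.append(caller)
            if answers_changed(caller):
                frontier.append(caller)
    return reanalyzed


# Module c imports e and f, and module a imports c.
callers = {"e": ["c"], "f": ["c"], "c": ["a"]}
erased = affected_modules("e", callers)
stable = recover("e", callers, lambda module: False)
unstable = recover("e", callers, lambda module: True)
```

When the answers of `c` are unaffected by the change in `e`, the recovery process stops after re-analyzing `c`, whereas the simple strategy discards the information for both `c` and `a`.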

7 Using a Manual Scheduling Policy 

Consider, for example, the relevant case of independent development of different 
parts of the program, which can then even be performed in parallel by differ- 
ent teams. In this setting, it makes sense that the analyzer performs its job on 
the current module without analyzing other modules in the program unit, i.e., 
it allows separate analysis. This will typically allow early detection of compile- 
time errors in the current module without having to wait for the code of the 
dependent modules to be fully developed. Moreover, in this setting, it is the user 
(or users) who decide when and what to analyze. Thus, we refer to this as the 
manual setting. Furthermore, we assume that in this setting analysis for a module
m has to do its best with only the code for m plus the results of previous
analyses (if any) of the modules in depends(m). These assumptions have
important implications. The setting allows the users of different modules to decide
when they should be processed. And thus, any module could be (re-)analyzed 
at any point. As a result, strong requirements must hold for the whole approach 
to be correct. In return, the results obtained may not be optimal (in terms of 
error detection, degree of optimization, etc., depending on the particular tools) 
w.r.t. those achievable using automatic scheduling. 

So the question is, is there any combination of the three parameters of the
global analysis framework which allows handling the manual setting? The
answer to this question is yes. Our earlier paper [4] essentially describes such an
instantiation of the analysis framework. In the terminology of the current paper,
the model in [4] corresponds to waiting until the user requests that a module m
in the program unit U be analyzed. The success policy is over-approximating.
This guarantees that in the absence of invalidated entries in the GTQ all events
will be marked +. This means that the analysis information available is correct,
though perhaps not as accurate as possible. Since the scheduling is manual, no
other analyses should be triggered until the user requests them. Finally, the entry
policy is simply to include in the GTQ an event such as $P : \top \mapsto^{+} \top$ per predicate
exported by any of the modules in U to be analyzed (this is called the all entry policy).
The initial events must be this general in order to preserve the overall correctness
of the analysis while allowing the users to choose the order of the modules to
be analyzed.^ The model in [4] has the very important feature of being
guaranteed to always provide correct results without the need of reaching a global
fixed-point.



8 Using an Automatic Scheduling Policy

In spite of the evident interest of the manual setting, there are situations in which 
the user is interested in obtaining the most accurate analysis results possible. For 
this, it may be required to analyze the modules in the program unit several times 
in order to converge to a distributed global fixed-point. We will refer to this as 
the automatic setting, in which the user decides when to start global analysis 
of a program unit. From then on it is the global analysis framework, by means
of its scheduling policy, which decides when and what to analyze. Note that the
manual and automatic settings roughly correspond to scenario 1 and scenario 2 
of [19] respectively. Since we admit circular dependencies among modules, the 
strategy has to be able to deal with such circularities correctly and efficiently 
without entering infinite loops. The question now is what are the values for the 
different parameters to our generic framework which should be used in order 
to obtain satisfactory results? One major difference of the automatic setting 
w.r.t. the manual setting is that in addition to over-approximations, now also 
under-approximations can be used. This is because though under-approximations 
do not guarantee correctness in general, when an inter-modular fixed-point is 
reached, analysis results are guaranteed to be correct. Below we consider the use 
of $SP^{+}$ and $SP^{-}$ separately.

8.1 Using Over-Approximating Success Policies

If a success policy $SP^{+}$ is used, we are in a situation similar to the one in
Section 7 in that independently of how many times each module has been analyzed,
if there have not been any code changes, the analysis results are guaranteed to 
be correct. The main difference is that now the system keeps on automatically 
requesting further analysis steps until a fixed-point is reached. 

Regarding the entry policy, an important observation is that in the automatic
mode, much as in the case of intra-modular analysis, inter-modular analysis
will eventually compute all call patterns which are needed in order to obtain
information which is correct w.r.t. calls, i.e., the set of computed call patterns
covers all possible calls which may occur at run-time for the class of initial calls
considered, i.e., those for the top-level of the program unit U. This will allow us
to use a different entry policy from that used in the manual mode: rather than
introducing events of the form $P : \top \mapsto^{+} \top$ in the GTQ for exported predicates
in all modules in U, it suffices to introduce them for predicates exported by the
top-level of U (this entry policy is named the top-level entry policy). This has several
important advantages: (1) It avoids analyzing all predicates for the most general
call pattern, since this may end up introducing plenty of call patterns which are
not used in our particular program unit U. (2) It helps to have a more guided
scheduling policy since there are no requests for processing a module until it is
certain that a call pattern should be analyzed. (3) If multiple specialization is
being performed based on the set of call patterns for each procedure (possibly
followed by a minimization step for eliminating useless versions [18]), the fact
that an entry with the most general call pattern exists implies that a
non-optimized version of the predicate must always exist; the top-level entry
policy avoids creating such entries. Another way out of this
problem is to eliminate useless call patterns once an inter-modular fixed-point
has been reached.

^ In the case of the Ciao system it is possible to use entry declarations (see for
example [16]) in order to improve the set of initial call patterns for analysis.
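
The two entry policies differ only in which exported predicates receive an initial event. In the sketch below, the `exports` table and the event tuples `(module, predicate, call, answer, mark)` are assumptions for the example, with the string `"top"` standing for the most general description.

```python
TOP = "top"   # stands for the most general description in this sketch


def all_entry_policy(exports):
    """One most-general event per predicate exported by any module in U."""
    return [(module, pred, TOP, TOP, "+")
            for module in sorted(exports)
            for pred in exports[module]]


def top_level_entry_policy(exports, top_level):
    """Most-general events only for the exports of the top-level module."""
    return [(top_level, pred, TOP, TOP, "+") for pred in exports[top_level]]


exports = {"a": ["main"], "c": ["p", "q"], "e": ["r"]}
manual_events = all_entry_policy(exports)
automatic_events = top_level_entry_policy(exports, "a")
```

Under the top-level policy, modules `c` and `e` only enter the queue once analysis of `a` has produced concrete call patterns for their predicates.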

Since reaching a global fixed-point can be a costly task, one interesting
possibility is the introduction of a time-out. The user can ask the system to
request (re-)analysis as needed towards improving the analysis information.
However, if after performing n analysis steps the time-out is reached before analysis
step n + 1 is finished, the global state corresponding to step n is guaranteed to be
correct. In this case, the entry policy used has to be to introduce most general
call patterns for all exported predicates, either before starting analysis or when
a time-out is reached.



8.2 Using Under-Approximating Success Policies

Another alternative is to use $SP^{-}$. As a result, the analysis results are not
guaranteed to be correct until an inter-modular fixed-point is reached. Thus,
it may take a large amount of time to perform this global analysis. On the
other hand, once a fixed-point is reached, the accuracy which will be obtained
is optimal, since it corresponds to the least analysis graph, which is exactly the
one which the flattening approach would have obtained.

Regarding the entry policy, the same discussion as above applies. The only
difference is that the GTQ should be initialized with events of the form
$P : \top \mapsto^{-} \bot$ since now the framework computes under-approximations. Clearly,
$\bot$ is an under-approximation of any description.

Another important thing to note is that, since the final results of automatic
analysis are optimal, they do not depend on the use of a particular success policy
$SP^{-}_{1}$ or another $SP^{-}_{2}$. Of course, the efficiency using $SP^{-}_{1}$ can be very different
from that obtained using $SP^{-}_{2}$.



8.3 Hybrid Policy 

In practice we may wish to use a manual scheduling policy with an
over-approximating success policy during program development, and then use an automatic
scheduling policy with an under-approximating success policy just before
program release, so as to ensure that the analysis is as precise as possible, thus
allowing as much optimization as possible in the final version.

Fortunately, in such a situation we can often reuse much of the analysis in- 
formation obtained using the over-approximating success policy. The reason is 
that if the analysis with the over-approximating success policy has reached a 
fixed-point, the answers obtained for module m are as accurate as those obtained
with an under-approximating success policy as long as there are no cyclic
dependencies between the modules in depends(m). Thus in the common case
that no modules are mutually dependent we can simply use the answer tables 
from the manual scheduling policy and use an automatic scheduling policy with 
an over-approximating success policy to obtain the fixed-point. Even in the case 
that some modules are mutually dependent we can use this technique to com- 
pute the answers for the modules which do not contain cyclic dependencies or 
do not depend on modules that contain them (e.g., leaf-modules). 



8.4 Computation of an Intermodular Fixed-Point 

Determining the optimal order in which the different modules in the program 
unit should be analyzed in order to get to a fixed-point as efficiently as possible 
is not trivial and it is the topic of ongoing work. 

Finding good scheduling strategies for intra-modular analysis is a topic which 
has received considerable attention and highly optimized algorithms exist which 
converge to a fixed-point quickly. Unfortunately, it is not possible to directly 
translate the same heuristics used in the intra-modular case to the inter-modular 
case. In the inter-modular case we have to take into account the time required 
to change from analysis of one module to another since this typically means 
reading a new module from disk. Thus, requests to process call patterns have 
to be grouped by modules in order to reduce the number of times we change 
context. 
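
The effect of grouping requests by module can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a pending request is a (module, predicate, call pattern) triple and that every change of module in the processing order costs one module load; the function names are hypothetical.

```python
from collections import defaultdict


def group_requests_by_module(requests):
    """Group (module, predicate, call_pattern) requests by module.

    Modules appear in the order in which they are first requested.
    """
    grouped = defaultdict(list)
    for module, predicate, call_pattern in requests:
        grouped[module].append((predicate, call_pattern))
    return dict(grouped)


def count_module_loads(module_sequence):
    """Count how many times the module being analyzed changes (one load each)."""
    loads = 0
    previous = None
    for module in module_sequence:
        if module != previous:
            loads += 1
            previous = module
    return loads
```

With requests arriving in the order a, b, a, b, processing them one by one needs four module loads, whereas processing the grouped requests needs only two.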

Taking the heuristics in [17,10] as a starting point we are investigating and 
experimenting with different scheduling policies which take into account different 
aspects of the structure of the program unit such as dependencies, strongly 
connected components, etc. with promising results. It also remains to be explored 
which of the approaches to success policy results in more efficiently reaching a 
global fixed-point and whether the heuristics to be applied in either case coincide 
or are mostly different. 



9 Some Practical Implementation Issues 

In this section we discuss several issues not addressed in the previous sections 
and which are very important in order to have practical implementations of 
context-sensitive analysis systems. These issues are related to the persistence of 
global information and the analysis of libraries. 

9.1 Making Global Information Persistent

The two-level framework presented in Section 6 needs to keep information both at 
the local and global level. One relevant question, due to its practical implications, 
is where this global information actually resides. One possibility is to have the 
global analysis tool running continuously as a kind of “compilation server” which 
stores the global state in its program memory. In a manual setting, this global 
tool would wait for the user(s) to place requests to analyze modules. When a 
request is received, the corresponding module is analyzed for the appropriate call
patterns, using the global information available at the time in the memory of
the global analyzer. After analysis terminates, the global information is updated 
and remembered by the process for subsequent requests. If we are in an automatic 
setting, the global tool itself requests the analysis of different modules until a 
global fixed-point (or a time-out) is reached. 

The approach outlined above is not fully persistent, in the sense that if the
computer crashes all information about the global state is lost and analysis
has to start from scratch again. In order to implement the more general
kind of persistence discussed in Section 4, a way to save and restore the global
state of analysis is needed. This requires storing the value of the three global-level
data structures: GAT, GDT, and GTQ. The level of granularity which seems
appropriate in this context is the module level; i.e., the global state
of analysis is saved and restored between two consecutive steps of (module)
analysis, but not during the analysis of a given module, which, from the point
of view of the two-level framework, is an atomic operation.

The ability to save and restore the global state of analysis has several advantages: 

1. The global tool does not need to be running continuously: it can save its state,
stop, restart when needed, and restore the global state. This is especially
interesting when using a manual scheduling policy, since two consecutive
analysis requests can be separated by large intervals.

2. Even if the automatic scheduling policy is used, any information about the 
global state which is still valid can be directly used. This means that analysis 
can be incremental in the sense that (global level) analysis information which 
is known to be valid is reused. 
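
A minimal Python sketch of saving and restoring the global state between two module analysis steps is given below. The JSON encoding, the file layout and the function names are assumptions made for illustration; the text does not prescribe any storage format for GAT, GDT and GTQ.

```python
import json
import os
import tempfile


def save_global_state(path, gat, gdt, gtq):
    """Write GAT, GDT and GTQ to `path` between two module analysis steps.

    The state is written to a temporary file first and then renamed, so a
    crash during saving leaves the previously saved state untouched.
    """
    payload = {"GAT": gat, "GDT": gdt, "GTQ": gtq}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, path)


def restore_global_state(path):
    """Return (GAT, GDT, GTQ); empty structures if nothing was saved yet."""
    if not os.path.exists(path):
        return {}, {}, []
    with open(path) as handle:
        payload = json.load(handle)
    return payload["GAT"], payload["GDT"], payload["GTQ"]
```

Saving only between steps matches the granularity discussed above: the analysis of a single module is treated as atomic and is never checkpointed halfway.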



9.2 Splitting Global Information 

Consider the analysis of module b in the program unit U = {a, b, c, d, e, f, g, h}
depicted in Figure 6. In principle, the global state includes information regarding
exported predicates in any of the modules in U. As a result, saving
the global state to disk and restoring it would involve storing and retrieving
information about all modules in U. However, analysis of b only requires retrieving
the information for modules in related(b). The small boxes which appear on
the side of every module represent the portion of the global structures related to
each module. To analyze module b, the information of the global tables that
we need is that of modules a, d and e, as indicated by the dashed curved line.

This is straightforward to do in practice by splitting the information in the
global data structures into several parts, each one associated with a module. This
allows easily identifying the pieces of global information which are needed in
order to process a given module.

This optimization of the handling of global information has several advan- 
tages: 

1. The time required to save and restore the information to disk is reduced 
since the total amount of information transferred is smaller. 

2. The use of the data structures during analysis can be more efficient since 
search space is reduced. 

3. The total amount of memory required in order to analyze a module can 
be significantly reduced: only the local data structures plus a possibly very 
reduced part of the global data structures are actually required to analyze 
the module. 

One question which we have intentionally left open is where the persistent
information should reside. In fact, all the discussion above is independent of how
and where the global state is stored, as long as it is persistent. One possibility
is to use a database which stores the global state, with information grouped
by modules in order to minimize the amount of information which has to be
retrieved or updated for each analysis. Another, very common, possibility is to
store the global information associated with each module on disk, in the same way
as temporary information (such as relocatable code) is stored in many traditional
compilers. In fact, the actual implementation of modular analysis in both the
CiaoPP and HAL [14] systems is based on this idea: a module m has an m.reg
file associated with it which contains the part of the global data structures which
is associated with m.
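
The per-module splitting can be sketched as follows. Only the idea of one m.reg file per module comes from the text; the JSON contents of the files, the directory argument and the function names are assumptions of this sketch and do not reflect the actual CiaoPP or HAL file formats.

```python
import json
from pathlib import Path


def registry_path(directory, module):
    """Location of the (hypothetical) registry file for `module`."""
    return Path(directory) / f"{module}.reg"


def save_module_entries(directory, module, entries):
    """Store the part of the global structures associated with one module."""
    registry_path(directory, module).write_text(json.dumps(entries))


def load_related(directory, related_modules):
    """Load only the registry files of the modules related to the one analyzed."""
    loaded = {}
    for module in related_modules:
        path = registry_path(directory, module)
        if path.exists():
            loaded[module] = json.loads(path.read_text())
    return loaded
```

For the program unit of Figure 6, analyzing module b would call load_related with the modules a, d and e, leaving the registry files of c, f, g and h unread.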



9.3 Handling Libraries and Predefined Modules 

Many compilers and program development systems include a large number of
predefined modules and libraries which can be readily reused by programmers,
an obviously interesting feature since it greatly reduces the time required to develop
applications. From the point of view of analysis, these predefined modules
and libraries differ from user programs in a number of ways:

1. They are designed with reusability in mind and thus they can be used by a 
comparatively large number of user programs. 

2. Sometimes the source code for libraries and predefined modules may not be 
available. One common reason for this is that they are implemented in a 
lower-level language. 

3. The total amount of code available as libraries can be extremely large. Thus, 
reanalyzing the libraries over and over again for slightly different call patterns 
can be costly. 

Fig. 6. Using Distributed Scheduling and Local Data Structures



Given these characteristics, it makes sense to develop a specialized treatment
for libraries. We propose the following scheme. For each library module, the
analysis results for a sufficient set of call patterns should be precomputed. This
set should cover all possible correct call patterns for the library. In addition,
the answer patterns for those call patterns have to be over-approximations of
the actual answers, independently of whether an SP⁺ or SP⁻ success policy is
used for the programs which use the library. Moreover, in order to provide
more accurate information, more particular call patterns which are expected to
occur often in programs which use that library module can also be included.
This information is added to the GAT of the program units which use the
library. Thus, the success policy will be able to use this information directly
for obtaining answer patterns. The reason for requiring pre-computed answer
patterns for library modules to be over-approximations is that, much in the
same way as for predefined procedures, even if an automatic scheduling policy is
used, library modules are (in principle) not analyzed for calling patterns other
than those which are pre-computed. Note that this is conceptually equivalent
to considering the interface information of library modules read-only, since any
program using them can read this information, but no additional call patterns
will be analyzed. As a result, the global level framework will ignore new call
patterns to library procedures that might be generated during the analysis of
user programs. More precisely, entries of the form P : CP ↦ AP in LAT such
that P is a library predicate do not need to be added to the GTQ since they will
not be analyzed. In addition, no entries of the form P : CP ⇒ H : CP′ need be
added to GDT if H is a library predicate, since the answer pattern for library
predicates is never modified and thus those dependencies are useless.
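
The read-only treatment of libraries amounts to two filters when the global structures are updated after a module has been analyzed. The Python sketch below illustrates this under assumed representations: a call pattern request is a (predicate, call pattern) pair, the GTQ is a list of such pairs, and the GDT maps a callee pair to the set of caller pairs that depend on it. These encodings and the function name are hypothetical.

```python
def update_global_structures(new_entries, new_dependencies, library_predicates, gtq, gdt):
    """Update GTQ and GDT, treating library interface information as read-only.

    new_entries:      (predicate, call_pattern) pairs found during local analysis
    new_dependencies: (caller, callee) pairs, each a (predicate, call_pattern) pair
    """
    for predicate, call_pattern in new_entries:
        if predicate in library_predicates:
            # Library predicates are never analyzed for new call patterns.
            continue
        gtq.append((predicate, call_pattern))
    for caller, callee in new_dependencies:
        if callee[0] in library_predicates:
            # Library answers never change, so the dependency is useless.
            continue
        gdt.setdefault(callee, set()).add(caller)
    return gtq, gdt
```

Treating a designated library with the general scheme instead would correspond to leaving its predicates out of library_predicates.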

Deciding which is the best set of call patterns for which a library module 
should be analyzed is a non-trivial problem. One possibility can be to extract 
call patterns from correct programs which use the library and study which are 
the call patterns most often used. Another possibility is to have the library 
developer decide which are the call patterns of interest. 

In spite of the considerations above, it is sometimes the case that we are
interested in treating a library module using the general scheme, i.e., effectively 
considering the library information writable and allowing the analysis of new call 
patterns and the storage of the corresponding results. This can be interesting if 
the source code of a library is available and the set of initial call patterns for 
which it has been analyzed is not very representative. Note that hopefully this 
will happen often only when the library is relatively new. Once the code of the 
library stabilizes and a good set of initial patterns is obtained, it will generally 
be considered read-only. Allowing reanalysis of a library can also be useful when 
we are interested in using the analysis results from such call patterns to optimize 
the code of the library for the particular cases that correspond to those calls. 
For this case it may be interesting to store the corresponding information locally 
to the calling module, as opposed to inserting it into the library directories. 

In summary, the implementation of the framework needs to treat libraries in 
a special way and also allow applying the general scheme for some designated 
library modules. 



10 Discussion and Conclusions 

Table 1 summarizes some characteristics of the different instantiations of the
generic framework presented in the paper, in terms of the design features discussed
in Section 4. The corresponding entries for the flattening approach of
Section 3 (our baseline, as usual) are also provided for comparison, listed in the
column labeled Flattening. The Manual column lists the characteristics of the
manual scheduling policy described in Section 7. The last two columns correspond
to the two instantiations of the automatic scheduling policy, which were
presented in Sections 8.1 and 8.2 respectively. Automatic⁺ (resp. Automatic⁻)
indicates that an over-approximating (resp. under-approximating) success policy
is used.

The first three rows, i.e., Scheduling policy, Success policy, and Entry policy,
correspond to the values of these parameters in each instantiation.

All instances of the framework for modular analysis are module-aware, in
contrast to Flattening, which is not. Both of the described instances of the
proposed modular framework are incremental, in the sense that only a subset of
the modules in the program unit (instead of every module) needs to be re-analyzed,
and they also both achieve the goal of not needing to reanalyze all call patterns
every time a module is considered for analysis.

Regarding correctness, both the Flattening and Automatic⁻ approaches have
in common that correctness is only guaranteed when analysis comes to an end.
This is because the approximations used are under-approximations and thus the
results are only guaranteed to be correct when a (global) fixed-point is reached.
However, in the Manual and Automatic⁺ approaches the information in the global
state is correct after any number of local analysis steps.

On the other hand, both the Flattening and Automatic⁻ approaches are guaranteed
to obtain the most accurate information possible, i.e., the least analysis
graph, when a fixed-point is reached. In contrast, the Manual approach cannot
guarantee optimal accuracy for two reasons. The first one is that there is no
guarantee that modules will be processed the number of times that is necessary
for an inter-modular fixed-point to be reached. Second, even if such a fixed-point
is reached, it may not be the least fixed-point. This is because this approach uses
over-approximations of the analysis results which are improved (“narrowed”) in
the different analysis iterations until a fixed-point is reached. On the other hand,
if there are no circular dependencies among predicates in different modules, then
the fixed-point obtained will be the least one, i.e., the most accurate.

Regarding efficiency in time we will consider two cases. The first one is when
we have to perform analysis of the program unit from scratch. In this case,
Flattening can be highly optimized in order to converge quickly to a fixed-point.
In contrast, in this situation the instances of the modular framework have the 
disadvantage that loading and unloading modules during analysis introduces a 
significant overhead. As a result, in order to keep the number of context
changes low, call patterns may be solicited from imported modules which use
temporary information and which are not needed in the final analysis graph.
These call patterns which end up being useless are known as spurious versions. 
This problem also occurs in Flattening, though to a much lesser degree if good 
algorithms are used. Therefore, the modular approaches may end up performing 
work which is speculative, and thus the total amount of work performed in the 
automatic approaches to modular analysis is in principle an upper bound of that 
needed in Flattening. 

On the other hand, consider the second case in which a relatively large 
amount of intra-modular analysis has already taken place for the modules to be 
analyzed in our programming unit and that the global information is persistent. 
In this case, the automatic approaches can update their global data structures 
using the precomputed information, rather than starting from scratch as is done 
in Flattening. In such a case the automatic approaches may perform much less 
work than Flattening. It is to be expected that once module m becomes stable,
i.e., it is fully developed, it will quickly be analyzed for a relatively large set
of calling patterns. In such a case it is likely that it will be possible to analyze
any other module m′ which uses m by simply reusing the existing analysis results
for m. This is especially true in the case of library modules, as discussed in
Section 9.3.

Regarding the efficiency in terms of memory, it is to be expected that the
instances of the modular framework will outperform the non-modular, flattening
approach. This was in fact already observed in the case of [4]. Indeed, one
important practical difficulty that appears during the (monolithic) analysis of
large programs is that the amount of information which is kept in memory is
very large and the storage needed can become too large to fit in memory. The
modular framework proposed needs less memory because: a) at each point in
time, only one module needs to be loaded in the code area, and b) the local
answer table only needs to hold entries for the module being analyzed, and not
for other modules. Also, in general, the total amount of memory required to
store the global data structures is not very high when compared to the memory
required locally for the different modules. In addition, not all the global data
structures are required when analyzing a module m, but only those associated
with the modules in related(m).

Table 1. Comparison of Approaches to Modular Analysis

                      Flattening          Manual              Automatic⁺          Automatic⁻
Scheduling policy     automatic           manual              automatic           automatic
Success policy        SP⁻                 SP⁺                 SP⁺                 SP⁻
Entry policy          top-level           all                 top-level           top-level
Module-aware          no                  yes                 yes                 yes
No Rean. of all CPs   no                  n/a                 yes                 yes
Correct               at fixed-point      yes                 yes                 at fixed-point
Accurate              yes                 no                  no circularities    yes
Efficient in time     yes                 n/a                 no                  no
Efficient in memory   no                  yes                 yes                 yes
Termination           finite asc. chains  finite asc. chains  finite chains       finite asc. chains

Finally, regarding termination, except for Flattening, in which only one level
of termination is required, the three other cases require two levels of termination:
at the intra-modular and at the inter-modular level. In Flattening, since analysis
results increase monotonically until a fixed-point is reached, termination is often
guaranteed by considering description domains which do not contain infinite ascending
chains: no matter what the current description is, top (⊤), which is trivially
guaranteed to be a fixed-point, is only a finite number of steps away. Exactly
the same condition is required for guaranteeing termination of Automatic⁻. The
manual approach only requires guaranteeing intra-modular termination since the
number of call patterns analyzed is finite. However, in the case of Automatic⁺, finite
ascending chains are required for ensuring local termination and finite descending
chains are required for ensuring global termination. As a result, termination
requires domains with finite chains, or appropriate widening operators.
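
The role of finite ascending chains can be illustrated with a small Python sketch of fixed-point iteration. The domain used here, sets of nodes of a finite graph ordered by inclusion, is an assumption chosen for simplicity and is not one of the description domains of the paper; the function names are hypothetical. Since every iteration that changes the description adds at least one element, the number of steps is bounded by the size of the universe.

```python
def least_fixed_point(transfer, bottom):
    """Iterate upwards from `bottom` until the description stops growing.

    Returns (fixed_point, number_of_growing_steps).
    """
    current = bottom
    steps = 0
    while True:
        following = current | transfer(current)  # join keeps the chain ascending
        if following == current:
            return current, steps
        current, steps = following, steps + 1


def make_reachability_transfer(edges, roots):
    """Transfer function: the roots plus all successors of known nodes."""
    def transfer(nodes):
        successors = {t for s in nodes for t in edges.get(s, ())}
        return frozenset(roots) | frozenset(successors)
    return transfer
```

On a domain with infinite ascending chains the same loop might not terminate, which is where a widening operator would be needed.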

In summary, the proposed two-level generic framework for analysis and its 
instantiations meet a good subset of our stated objectives. We hope the dis- 
cussion and the concrete proposal presented in this paper will provide a better 
understanding of the handling of context-sensitive program analysis on modular 
programs and contribute to the widespread use of such context-sensitive analy- 
sis techniques for modular programs in practical systems. An implementation of 
the framework, as a generalization of the pre-existing CiaoPP modular analysis 
components, is currently being completed. In this context, we are experiment- 
ing with different scheduling policies for the global level, for concrete, practical 
analysis situations. 

Abstract. Formal verification of reactive concurrent systems is important
since many hardware and software components of our computing
environment can be modeled as reactive concurrent systems. Algorith- 
mic techniques for verifying concurrent systems such as model checking 
can be applied to finite state systems only. This chapter investigates 
the verification of a common class of infinite state systems, namely pa- 
rameterized systems. Such systems are parameterized by the number of 
component processes, for example an n-process token ring for any n. 
Verifying the entire infinite family represented by a parameterized sys- 
tem lies beyond the reach of traditional model checking. On the other 
hand, deductive techniques to verify infinite state systems often require 
substantial user guidance. 

The goal of this work is to integrate algorithmic and deductive tech- 
niques for automating proofs of temporal properties of parameterized 
concurrent systems. Here, the parameterized system to be verified and 
the temporal property are encoded together as a logic program. The 
problem of verifying the temporal property is then reduced to the prob- 
lem of determining equivalence of predicates in this logic program. These 
predicate equivalences are established by transforming the program such 
that the semantic equivalence of the predicates can be inferred from the 
structure of their clauses in the transformed program. 

For transforming the predicates, we use the well-established unfold/fold 
transformations of logic programs. Unfolding represents a step of resolution
and can be used to evaluate the base case and the finite part of the
induction step in an induction proof. Folding and other transformations 
represent deductive reasoning and can be used to recognize the induction 
hypothesis. Together these transformations are used to construct induc- 
tion proofs of temporal properties. Strategies are developed to help guide 
the application of the transformation rules. The transformation rules 
and strategies have been implemented to yield an automatic and pro- 
grammable first order theorem prover for parameterized systems. Case 
studies include multi-processor cache coherence protocols and the Java 
Meta-Locking protocol from Sun Microsystems. The program transfor- 
mation based prover has been used to automatically prove various safety 
properties of these protocols. 

1 Introduction

Many hardware and software components of our everyday computing environment
can be modeled as reactive concurrent systems. These include hardware
controllers, operating systems, network protocols, and distributed applications,
e.g., air traffic control systems. Intuitively, a reactive concurrent system is a collection
of nonterminating processes which run concurrently, and communicate
with each other as well as an external environment to perform a common task.
Proving correctness of such a system involves showing that it displays some de- 
sired behavior. Formally proving correctness of such systems has been a topic 
of intense research for the past two decades, leading to the birth of successful 
techniques like model checking [8]. 

Formal verification of reactive systems involves: (i) constructing the “specification”,
i.e., the description of the desired behavior(s) of the program, (ii) constructing
the “implementation”, i.e., the formal description of the reactive system
being verified, and (iii) formally proving that the implementation satisfies the
specification. Given appropriate formalisms for expressing the specification and
implementation, we then need a proof system for establishing that a given implementation
satisfies a given specification. Given a proof system and a proof
obligation (i.e., a given implementation and specification), one needs to construct
a proof tree by repeated application of the rules to the proof obligation.
In general, this proof tree construction is undecidable [31].

However, for finite state concurrent systems, this can be achieved algorith- 
mically by searching the finite model of the implementation, i.e. by searching 
the states of the finite state transition system representing the behaviors of the 
concurrent system. This is the basic idea behind model checking. Model check- 
ing [8] is an automated formal verification technique for proving properties of 
finite state concurrent programs. Here the specification is typically provided as 
a temporal logic formula. The implementation is often expressed using a pro- 
cess calculus, which is translated to a finite state transition system. Verifying 
the truth of the temporal formula is accomplished by traversing the states of 
this transition system based on the structure of the temporal formula. If the 
formula is true, then the search succeeds; otherwise the search fails and yields a
counterexample.

The Problem Addressed. The applicability of model checking is inherently
restricted to finite state systems. Many of the verification tasks one would like
to conduct, however, deal with infinite state systems. In particular, we often need
to verify “parameterized” systems such as an n-bit adder or an n-process token
ring for any n. Intuitively, a parameterized system is an infinite family of finite
state systems parameterized by a recursively defined type, e.g., the set of natural
numbers ℕ. Thus an n-bit adder is a parameterized system, the parameter in
question being n ∈ ℕ, the width of the adder circuit. Verification of distributed
algorithms can be naturally cast as verifying parameterized systems, the parameter
being the number of processes. For example, consider a distributed algorithm
where n users share a resource and follow some protocol to ensure mutually exclusive
access. Using model checking, we can verify mutual exclusion only for
finite instances of the algorithm, i.e., for n = 2, n = 3, ..., but not for every n.
The verification of parameterized systems lies beyond the reach of traditional
model checkers: the representations and the model-checking algorithms that manipulate
these representations are designed to work on finite state systems and
it is not at all trivial to adapt them for parameterized system verification.
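
The limitation can be made concrete with a small explicit-state search, sketched in Python. The protocol below (a single token circulating in a ring, where only the token holder may enter its critical section) is a hypothetical example invented for this sketch and is not one of the protocols studied in this chapter. Each call checks exactly one instance of the family.

```python
from collections import deque


def successors(state, n):
    """Transitions of the hypothetical n-process token ring."""
    token, local = state
    if local[token] == "idle":
        # The token holder may enter its critical section ...
        yield (token, local[:token] + ("critical",) + local[token + 1:])
        # ... or pass the token to its right neighbour.
        yield ((token + 1) % n, local)
    else:
        # A process in its critical section can only leave it.
        yield (token, local[:token] + ("idle",) + local[token + 1:])


def check_mutual_exclusion(n):
    """Breadth-first search of the reachable states of one instance.

    Returns (property_holds, number_of_states_visited).
    """
    initial = (0, ("idle",) * n)
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if state[1].count("critical") > 1:
            return False, len(seen)
        for following in successors(state, n):
            if following not in seen:
                seen.add(following)
                queue.append(following)
    return True, len(seen)
```

Running the check for n = 2, 3, 4, ... establishes the property for each of those instances separately, but no finite number of such runs covers every n; that gap is what the induction proofs constructed in this chapter are meant to close.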

In general, automated verification of parameterized systems has been shown 
to be undecidable [3]. Thus, verification of parameterized systems is often ac- 
complished via theorem proving, i.e. mechanically checking the steps of a human 
proof using a deductive system. Even with substantial help from the deductive 
system in dispensing routine parts of the proof, such theorem proving efforts 
require considerable user guidance. Alternatively, one can identify subclasses of 
parameterized systems for which verification is decidable [21, 33]. Using this ap- 
proach meaningful subclasses have been identified, such as token rings of similar 
processes [14] and classes of parameterized synchronous systems [17]. 

The Approach Taken. A parameterized system represents an infinite family
parameterized by a recursively defined type. Therefore, it is natural to attempt 
proving properties of parameterized systems by inducting over this type. In this 
work, we aim to automate the construction of such induction proofs by restricting 
the deductive machinery for constructing proofs. We construct an automatic and 
programmable first order logic prover with limited deductive capability. 

The research reported in this chapter is part of recent efforts to exploit logic 
programming technology for developing new tools and techniques to specify and 
verify concurrent systems. For example, constraint logic programming has been 
used for the analysis and verification of hybrid systems [49] and more recently for 
model checking of finite-state [34] / infinite-state systems [13]. In [40], a memo- 
ized logic programming engine is used to develop XMC, an efficient and flexible 
model checker whose performance is comparable to that of highly optimized 
model checkers such as Spin [22]. Recently, [12] used constraint logic program- 
ming to construct proofs of safety properties of parameterized cache coherence 
protocols. Essentially, these techniques aim to use (constraint) logic program 
evaluation to efficiently construct verification proofs involving state space search 
(accomplished via resolution) and (possibly) constraint solving. These techniques 
do not construct induction proofs and are not applicable to parameterized net- 
works of different topologies; thus, the technique of [12] cannot be used to prove 
properties of parameterized tree networks. On the other hand, we construct 
an automatic and programmable first order logic prover with limited deductive 
capability. This prover is geared to construct nested induction proofs which typ- 
ically proceed by inducting on the structure of the parameterized network. The 
core technology of our prover is provided by logic program transformations. We 
discuss related proof techniques based on program transformations in Section 9. 

Our work provides a methodology for constructing induction proofs by suit- 
ably extending the resolution based evaluation mechanism of logic programs 
[42, 44]. In this approach, the parameterized system and the property to be verified
are expressed as a logic program. The verification problem is reduced to the
problem of determining the equivalence of predicates in this program. The predicate
equivalences are then established by transforming the predicates such that
their semantic equivalence can be inferred from the syntax of their transformed
definitions. The proof of semantic equivalence of two transformed predicates p, p′
then proceeds automatically by a routine induction on the size of the proofs of
ground instances of p(X) and p′(X).

For transforming the predicates, we use the well-established unfold/fold trans- 
formations of logic programs [46] which have been previously used for program 
optimization [11, 38] and automated deduction [23, 25, 36]. The major trans- 
formations in such a transformation framework are unfolding, folding and goal 
replacement. One of these transformations (unfolding) represents an application 
of resolution. In particular, an application of the unfold transformation rep- 
resents a single resolution step. Therefore, one can achieve on-the-fly explicit 
state algorithmic model checking by repeated unfolding of the verification proof 
obligation. In constructing induction proofs, unfold transformations are used to 
evaluate away the base case and the finite portions of the proof in the induc- 
tion step of the induction argument. Folding and goal replacement, on the other 
hand, represent a form of deductive reasoning. They are used to simplify the 
given program so that applications of the induction hypothesis in the induction 
proof can be recognized. The reader should note that the folding transformation
is more powerful than the memoing involved in traditional tabled resolution
[7, 40, 48]. Folding remembers (disjunctions of) conjunctions of atoms and can 
be used to remember/recognize the induction hypothesis in an induction proof. 

Contributions The contributions of this work can be summarized as follows. 

— First, it shows how logic program evaluation based techniques for verifying 
finite state systems can be flexibly extended to yield a program transfor- 
mation framework for constructing inductive proofs of temporal properties 
of parameterized systems. Since one of our transformations corresponds to 
a model checking step and the others correspond to deductive reasoning, 
model checking emerges as a special case when the deductive steps are ap- 
plied lazily.^ 

— The program transformation framework presented here allows for tight inte- 
gration of algorithmic and deductive verification steps in a proof. Note that 
application of unfolding and folding steps can be arbitrarily interleaved in 
the verification proof of a parameterized system. This constitutes a tighter 
integration of model checking computation with deductive reasoning as com- 
pared to the integration of model checking as a decision procedure in a the- 
orem prover. 

— Finally, we present terminating strategies for controlling the application of
the transformation rules, thereby leading to the implementation of a
programmable and fully automatic prover (which is incomplete). These strategies
are used to construct safety proofs of parameterized systems of various
"structures" including uni- and bi-directional chains, rings and trees of
processes. The prover has been used to construct automated proofs of safety
properties of cache coherence protocols. It has also been used to automatically
verify mutual exclusion in the Java Meta-Locking Algorithm (a recently
developed algorithm to ensure secure access of Java objects by multiple Java
threads) from Sun Microsystems [1].

^ However, unlike model checking, our augmented proof technique does not generate
counter-example evidence. We discuss this issue further in Section 9.

Organization The rest of this chapter is organized as follows. In Section 2 we 
discuss how we encode the problem of verifying temporal properties of parame- 
terized systems as a logic program. Section 3 presents an overview of our proof 
technique, while Section 4 presents the proof rules on which our technique is 
based. Section 5 discusses the automation of each application of a proof rule 
while Section 6 presents a framework to guide application of proof rules when 
several of them are applicable. Section 7 presents an example proof using our 
technique. Section 8 summarizes some applications of our proof technique along 
with experimental results. Finally, Section 9 provides concluding remarks and
comparisons to related work. 



2 Encoding the Verification Problem 

In this section, we discuss how to encode the problem of verifying parameterized 
concurrent systems as a logic program. Intuitively, a parameterized concurrent 
system can be viewed as a network of an unbounded number of finite state 
processes which communicate in a specific pattern. These finite state processes 
constituting the network have a finite number of process types, and their com- 
munication pattern is called the network topology. For example, an n bit shift 
register (for any n) is a parameterized system. It represents an unbounded num- 
ber of finite state processes communicating along a chain. These finite state 
processes are “similar”, each of them representing a single bit. To model a pa- 
rameterized system as a logic program, the local states of the constituent finite 
state processes are represented by terms of finite size. The global state of the 
parameterized system is represented by a term of unbounded size consisting of 
these finite terms as sub-terms. The initial states and the transition relation of 
the parameterized system are then encoded as logic program predicates with 
such unbounded terms as arguments. 



2.1 System Specification 

In our encoding, the global states of the parameterized system to be verified are
represented by unbounded terms. This is because (i) a global state is a
composition of local states of the constituent processes and (ii) a parameterized
system typically contains an unbounded number of processes (recall that a
parameterized system is an infinite family of finite state systems). The initial
states and the transition relation of the parameterized system are specified as
two logic program predicates gen/1 and trans/2 over these terms. Thus, for any
ground term t, gen(t) is true iff t is an initial state of one member of the
parameterized family being verified. Similarly, for any two ground terms t, t',
we require that trans(t, t') is true iff t → t' is a transition in one of the
members of the parameterized family. The recursive structure of gen and trans
depends on the topology of the parameterized network being verified.



    gen([1]).                                 thm(X) :- gen(X), live(X).
    gen([0|X]) :- gen(X).                     live(X) :- X = [1|_].
    trans([0,1|T], [1,0|T]).                  live(X) :- trans(X,Y), live(Y).
    trans([H|T], [H|T1]) :- trans(T,T1).

    System description                        Property description

Fig. 1. Example: Liveness in an unbounded length shift register



For example, in an n bit shift register (for any n), the local states of the bit 
process are represented by the terms 0 and 1 (corresponding to the situations 
where the value stored in the bit is 0 and 1 respectively). A global state of the 
register is then represented by an unbounded list where each element of the list
is 0 or 1. Now, let us consider an n bit shift register where initially the rightmost
bit of the chain contains 1 and all other bits contain 0. The system evolves by
passing the 1 leftward. A logic program describing the system is given in Figure 1.
The predicate gen generates the initial states of an n-process chain for all n. As
mentioned above, a global state of the register is represented as an ordered list
(a list in Prolog-like notation is of the form [Head | Tail]) of zeros and ones.
The set of bindings of variable S upon evaluation of the query gen(S) is { [1] , 
[0,1], [0,0,1], ... }. The predicate trans in the program encodes a single 
transition of the global automaton. The first clause in the definition of trans 
captures the transfer of the 1 from right to left; the second clause recursively 
searches the state representation until the first clause can be applied, (i.e., when 
the 1 is not already in the left-most bit). 
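The behaviour of gen, trans and live on small instances can be explored with a short Python sketch. It is only an illustration under stated assumptions: Python lists stand in for Prolog lists, the function names mirror the predicates of Figure 1, and the loop checks liveness only for registers of up to 8 bits (an arbitrary bound), which is testing rather than the inductive proof developed in this chapter.

```python
def gen(n):
    """Initial state of an n-bit register: 1 in the rightmost bit, 0 elsewhere."""
    return [0] * (n - 1) + [1]


def trans(state):
    """All successors of a global state, mirroring the two trans clauses."""
    succs = []
    # Clause 1: trans([0,1|T], [1,0|T]).
    if state[:2] == [0, 1]:
        succs.append([1, 0] + state[2:])
    # Clause 2: trans([H|T], [H|T1]) :- trans(T, T1).
    if state:
        for t1 in trans(state[1:]):
            succs.append([state[0]] + t1)
    return succs


def live(state):
    """live(X): a state with the 1 in the left-most bit is reachable from X."""
    if state[:1] == [1]:
        return True
    return any(live(y) for y in trans(state))


# thm(X) :- gen(X), live(X), checked for n = 1..8 only (assumed bound).
checked = [live(gen(n)) for n in range(1, 9)]
print(all(checked))
```

For an initial state the 1 can only move left, so the recursion in live terminates on these inputs; the sketch makes no claim about arbitrary bit patterns.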



2.2 Temporal Property and Proof Obligations 

So far, we have illustrated how the parameterized system to be verified can be
encoded as a logic program. The temporal property to be verified can also be
encoded as a logic program predicate over global states of the system. In this
chapter, we only consider those properties φ such that φ (or its negation) can be
encoded as a definite logic program. This includes weak liveness properties such
as EFp and invariant properties such as AGp where p is an atomic proposition
about system states and A, E, F, G are operators of the branching time temporal
logic CTL [16]. These temporal properties have a single fixed point operator.
For our shift register example, we consider a CTL property of the form EFp, that
is, eventually the 1 reaches the left-most bit. This is encoded by the predicate
live in Figure 1. The first clause of live succeeds for global states where the
1 is already in the left-most bit (a good state). The second (recursive) clause of
live checks if a good state is reachable after a (finite) sequence of transitions.
Every member of the parameterized family satisfies the liveness property if and
only if $\forall X\ \mathit{gen}(X) \Rightarrow \mathit{live}(X)$. Since thm(X) is
defined as the conjunction of gen(X) and live(X), this is the case if

    $\forall X\ \mathit{thm}(X) \Leftrightarrow \mathit{gen}(X)$

i.e. if thm and gen are semantically equivalent. Thus, we have encoded the
verification problem as a logic program and reduced the proof obligation to
establishing equivalence of program predicates.

3 Overview of Our Proof Technique 

We now illustrate how we can construct induction based proofs arising in pa- 
rameterized system verification via logic program transformations. Essentially, 
this is accomplished using the following steps: 

1. Encode the temporal property to be verified as well as the parameterized
system as a logic program $P_0$.

2. Convert the verification proof obligation to predicate equivalence proof
obligations of the form $P_0 \vdash p \equiv q$ (p, q are predicates).

3. Construct a transformation sequence $P_0, P_1, \ldots, P_k$ s.t.

(a) Semantics of $P_0$ = Semantics of $P_k$

(b) from the syntax of $P_k$ we infer $P_k \vdash p \equiv q$

The construction of a transformation sequence proceeds by repeated application
of transformation rules. If several rules are applicable, strategies are used for
rule selection. Inferring $P_k \vdash p \equiv q$ from syntax is achieved by a
sufficient condition called Syntactic Equivalence which is formally defined in
Section 5.1 (see Definition 4). Also, note that we are dealing with definite logic
programs, and the semantics that we consider for a definite logic program is its
least Herbrand model [10].

In the shift register example, we have encoded the problem of verifying liveness
in an n bit shift register as the logic program $P_0$ in Figure 1. We have
reduced the verification proof obligation to establishing the equivalence of thm
and gen predicates in program $P_0$. We then apply program transformations to
program $P_0$ to obtain a program $P_k$ where thm and gen are defined as follows:

    gen([1]).                    thm([1]).
    gen([0|X]) :- gen(X).        thm([0|X]) :- thm(X).

Fig. 2. Fragment of Transformed Program for Shift Register Example

In this example, the transformed definitions of thm and gen are identical
modulo predicate (and variable) renaming. In general, we have a sufficient
condition called syntactic equivalence s.t. if two predicates p and q are
syntactically equivalent in program $P_k$ then p and q are semantically equivalent
in $P_k$. Furthermore, we ensure that checking syntactic equivalence of two
predicates in a given program is decidable. In the shift register example, the
transformed definitions of gen and thm given in Figure 2 are syntactically
equivalent. The formal definition of syntactic equivalence is presented in
Section 5.1. The definitions of gen and thm given above both represent the
infinite set $\{[0^n, 1] \mid n \in \mathbb{N}\}$. For each element X in this set,
we can therefore construct a ground proof of thm(X) and gen(X). Formally, we
define a ground proof as:

Definition 1 (Ground Proof) Let T be a tree, each of whose nodes is labeled
with a ground atom. Then T is a ground proof in a definite program P, if every
node A in T satisfies the condition: $A$ :- $A_1, \ldots, A_n$ is a ground instance
of a clause in P, where $A_1, \ldots, A_n$ ($n \geq 0$) are the children of A in T.

For example, ground proof trees^ of gen([0,0,1]) and thm([0,0,1]) (using
the above clauses of thm and gen) are shown below.

    gen([0,0,1])        thm([0,0,1])

    gen([0,1])          thm([0,1])

    gen([1])            thm([1])




Inferring the equivalence of thm and gen from the transformed definitions in 
Figure 2 involves an induction on the size of the proof trees of gen(X) and thm(X) 
for any ground term X. In general, to prove the equivalence of two predicates 
p, p' of same arity we first transform their definitions to syntactically equivalent 
forms. Then, the proof of semantic equivalence of two syntactically equivalent 
predicates p,p' proceeds (by definition of syntactic equivalence) as follows: 

— show that for every ground proof of $p(X)\theta$ (where X are variables and
$\theta$ is any ground substitution of X) there exists a ground proof of
$p'(X)\theta$. This follows by induction on the size of ground proofs of
$p(X)\theta$.

— show that for every ground proof of $p'(X)\theta$ (where X are variables and
$\theta$ is any ground substitution of X) there exists a ground proof of
$p(X)\theta$. This follows by induction on the size of ground proofs of
$p'(X)\theta$.

Thus, transforming gen and thm to obtain the definitions of Figure 2 and then
inferring the equivalence from these transformed definitions amounts to an
induction proof of the liveness property. Note that even though we are actually
inducting on the size of ground proofs, here this is the same as inducting on the
process structure of the parameterized system: the length of the shift register.

^ In this particular example, these are the only ground proofs of gen([0,0,1]) and
thm([0,0,1]).

We now formally describe our proof technique. Since we always prove equivalence
of logic program predicates, we start by constructing a proof system for
predicate equivalence proof obligations. Formally, the predicate equivalence
problem is: given a definite logic program P and a pair of predicates p and p' of
the same arity, determine if $P \models p \equiv p'$, i.e. whether p and p' are
semantically equivalent in P. In other words, we need to determine whether for
all ground substitutions $\theta$,
$p(X)\theta \in M(P) \Leftrightarrow p'(X)\theta \in M(P)$. Here $M(P)$ denotes the
least Herbrand model [30] of program P. Henceforth, whenever we refer to the
"semantics" of a definite logic program P, we mean its least Herbrand model
$M(P)$.

4 A Proof System for Predicate Equivalences 

We develop a tableau-based proof system for establishing predicate equivalence.
The proof system presented here can be straightforwardly extended to prove
goal equivalences^ instead of predicate equivalences. Our process is analogous to
SLD resolution. Recall that given a goal Q and a definite logic program P, SLD
resolution is used to prove whether instances of Q are in $M(P)$. This proof is
constructed recursively by deriving new goals via resolution. The truth of Q is
then shown by establishing the truth of these new goals. In contrast, each node in
our proof tree denotes a pair of predicates (p, p'). To establish their equivalence
we must establish that the predicates in the pair represented by each child node
are equivalent. Note that the predicates in the child node are to be obtained
from the syntax of the current definitions of p, p'. We now define:

Definition 2 (e-atom) Let $\Gamma = P_0, P_1, \ldots, P_i$ be a sequence of
programs. An e-atom is of the form $\Gamma \vdash p \equiv p'$ where p and p' are
predicates of the same arity appearing in each of the programs in $\Gamma$. It
represents the proof obligation

    $\forall\, 0 \leq j \leq i:\ P_j \models p \equiv p'$

Thus, an e-atom $\Gamma \vdash p \equiv p'$ represents the proof obligation that
p, p' are semantically equivalent in each of the programs in $\Gamma$. We
generalize the problem of establishing a single e-atom to that of establishing a
sequence of e-atoms. We define an e-goal as a (possibly empty) sequence of
e-atoms. We will often denote an e-goal by $\mathcal{E}$, possibly with primes and
subscripts. Recall that SLD resolution proves a goal by unfolding an atom in the
goal. Similarly, we proceed to prove an e-goal by transforming the relevant
clauses of an e-atom (i.e. the clauses of the predicates appearing in the e-atom)
in the e-goal.

The three rules used to construct an equivalence tableau are shown in Table 1.
In the description of the proof rules $\Gamma$ denotes a sequence of programs
$P_0, \ldots, P_i$. Given a definite logic program $P_0$, and a pair of predicates
of the same arity p, p', we construct a tableau for the proof obligation
$P_0 \vdash p \equiv p'$ by repeatedly applying the inference rules in Table 1.

^ Recall that in a definite logic program, a goal is a conjunction of atoms.







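Name    Top-down Inference (one step)                                                             Side Condition

(Ax)    $\mathcal{E},\ \Gamma \vdash p \equiv p',\ \mathcal{E}'$
        ----------------------------------------------------------------------                   $p \stackrel{P_i}{\equiv} p'$
        $\mathcal{E},\ \mathcal{E}'$

(Tx)    $\mathcal{E},\ \Gamma \vdash p \equiv p',\ \mathcal{E}'$
        ----------------------------------------------------------------------                   $M(P_{i+1}) = M(P_i)$
        $\mathcal{E},\ \Gamma, P_{i+1} \vdash p \equiv p',\ \mathcal{E}'$

(Gen)   $\mathcal{E},\ \Gamma \vdash p \equiv p',\ \mathcal{E}'$
        ----------------------------------------------------------------------                   $P_0 \models q \equiv q' \Rightarrow M(P_{i+1}) = M(P_i)$
        $\mathcal{E},\ \Gamma, P_{i+1} \vdash p \equiv p',\ P_0 \vdash q \equiv q',\ \mathcal{E}'$

Table 1. Proof System for Showing Predicate Equivalences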



The axiom elimination rule (Ax) is applicable whenever the equivalence
of the predicates p and p' can be established by some automatic mechanism,
denoted in the rule by $p \stackrel{P_i}{\equiv} p'$. Thus,
$\stackrel{P_i}{\equiv}$ is a decision procedure which infers the equivalence of
p, p' in program $P_i$. Axiom elimination will typically be an application of what
we call syntactic equivalence, a decidable equivalence of predicates based on the
syntactic form of the clauses defining them.

The program transformation rule (Tx) attempts to simplify the program 
in order to expose the equivalence of predicates (which can then be inferred 
via an application of Ax). The program $P_{i+1}$ is constructed from $P_i$ using a
semantics preserving program transformation. We use this rule whenever we 
apply an unfolding, folding, or any other (semantics-preserving) transformation 
that does not add any equivalence proof obligations. We give a brief presentation 
of these transformation rules in the next section. 

The equivalence generation rule (Gen) proves an e-atom $\Gamma \vdash p \equiv p'$
by performing replacements in the clauses of p, p'. In particular, an occurrence
of some predicate q in the clauses of p, p' is replaced by an occurrence of another
predicate q'. The guarantee is that if the predicates q, q' are semantically
equivalent then the program thus obtained is semantics preserving. This appears
as the side condition of the Gen rule. The notation $P_0 \models q \equiv q'$ is a
shorthand for the following: for all ground substitutions $\theta$,
$q(X)\theta \in M(P_0) \Leftrightarrow q'(X)\theta \in M(P_0)$ where $M(P_0)$ is the
least Herbrand model of $P_0$. Note that the proof of semantic equivalence of p
and p' is being constructed by using the semantic equivalence of q and q'. This
allows us to simulate nested induction proofs. Typically, an application of the
Gen rule corresponds to applying the goal replacement transformation.^

The notion of a tableau for a predicate equivalence proof obligation in a 
definite logic program Pq is then defined in the usual way. 



^ The Gen rule does not require $\{p, p'\} \cap \{q, q'\} = \emptyset$. When we
synthesize an algorithmic framework for applying the proof rules we will keep
track of the past history of equivalence proof obligations.

Definition 3 (Equivalence Tableau) An equivalence tableau of an e-goal
$\mathcal{E}_0$ is a finite sequence of e-goals
$\mathcal{E}_0, \mathcal{E}_1, \ldots, \mathcal{E}_n$ where $\mathcal{E}_{i+1}$ is
obtained from $\mathcal{E}_i$ by applying one of the rules described in Table 1 and
$\mathcal{E}_n$ is empty.

Now, let $P_0$ be a definite logic program and p, p' be predicates of the same
arity appearing in $P_0$. Then we use our proof system to construct an equivalence
tableau of $\mathcal{E}_0 = (P_0 \vdash p \equiv p')$.

Theorem 1 (Soundness of Proof System) Let
$\mathcal{E}_0, \mathcal{E}_1, \ldots, \mathcal{E}_n$ be a successful tableau with
$\mathcal{E}_0 = (P_0 \vdash p \equiv p')$ for some definite logic program $P_0$.
Then $P_0 \models p \equiv p'$, i.e. predicates p and p' are semantically
equivalent in the least Herbrand model of $P_0$.

Proof: We prove a stronger result. For any successful tableau of an e-goal
$\mathcal{E}_0$, if $\Gamma \vdash p \equiv p'$ is an e-atom in $\mathcal{E}_0$
where $\Gamma = P_0, \ldots, P_i$ then $P_i \models p \equiv p'$.

The proof for this result is established by induction on the length of the
tableau. For the base case, we have a tableau of length 1, which is formed by
an application of the Ax rule. For such a tableau the result holds trivially since
Ax is applied only when the semantic equivalence of p, p' can be automatically
inferred in $P_i$. For the induction step, we consider a tableau
$\mathcal{E}_0, \mathcal{E}_1, \ldots, \mathcal{E}_{k+1}$ of length $k+1$. For all
e-atoms of $\mathcal{E}_0$ which are not modified in the step
$\mathcal{E}_0 \to \mathcal{E}_1$, the result follows by induction hypothesis. Let
$P_0, \ldots, P_i \vdash p \equiv p'$ be the e-atom in $\mathcal{E}_0$ that is
modified.

— Ax: If the rule applied to $\mathcal{E}_0$ is Ax, then from the side condition
of Ax we have $P_i \models p \equiv p'$.

— Tx: If the rule applied to $\mathcal{E}_0$ is Tx, then
$P_0, \ldots, P_i, P_{i+1} \vdash p \equiv p'$ is an e-atom in $\mathcal{E}_1$.
Since $\mathcal{E}_1, \ldots, \mathcal{E}_{k+1}$ is a successful tableau of
$\mathcal{E}_1$, by induction hypothesis $P_{i+1} \models p \equiv p'$. By the side
condition of Tx, we have $M(P_i) = M(P_{i+1})$ and therefore
$P_i \models p \equiv p'$.

— Gen: If the rule applied to $\mathcal{E}_0$ is Gen, then
$P_0, \ldots, P_i, P_{i+1} \vdash p \equiv p'$ and $P_0 \vdash q \equiv q'$ are
e-atoms in $\mathcal{E}_1$. Again $\mathcal{E}_1, \ldots, \mathcal{E}_{k+1}$ is a
successful tableau of $\mathcal{E}_1$. By induction hypothesis, we have
$P_{i+1} \models p \equiv p'$ and $P_0 \models q \equiv q'$. From the side
condition of Gen we have $M(P_i) = M(P_{i+1})$ and therefore
$P_i \models p \equiv p'$.

In particular, if $\mathcal{E}_0, \ldots, \mathcal{E}_n$ is a successful tableau of
$\mathcal{E}_0 = (P_0 \vdash p \equiv p')$ then $P_0 \models p \equiv p'$. □

The tableau can be readily extended to use some transformations that may 
not preserve least models, but only ensure that the least models, with respect 
to the predicates in the original program, are same. A transformation that adds 
new predicates to a program has this property, and is often useful in predicate 
equivalence proofs. From the soundness of the proof system, we can also infer the 
following property for any equivalence tableau: for any e-atom
$\Gamma \vdash \ldots$ appearing in an equivalence tableau, all programs in
$\Gamma$ are semantically equivalent. The proof appears in [41].

Lemma 2 Let $\mathcal{E}_0, \ldots, \mathcal{E}_n$ be an equivalence tableau of
$\mathcal{E}_0 = (P_0 \vdash p \equiv p')$. For every e-atom
$(\Gamma \vdash \ldots)$ in the tableau, if $\Gamma = P_0, \ldots, P_i$ then we have
$M(P_0) = \ldots = M(P_i)$.

Note that the proof system given in Table 1 is not complete. There can be 
no such complete proof system as attested to by the following theorem. 

Theorem 3 (Incompleteness) Determining equivalence of predicates de- 
scribed by logic programs is not recursively enumerable. 

The theorem is easily proved using a reduction described in [3] . For a Turing 
machine M, we construct a program having two predicates, one that describes 
the natural numbers and the other that identifies an n such that M does not 
halt within n moves. These predicates are equivalent if and only if M does not 
halt. The non-halting problem is not recursively enumerable and so the predicate 
equivalence problem cannot be recursively enumerable. 

5 Automated Instances of Proof Rules 

In this section, we discuss the automation of each application of an Ax, Tx or 
Gen rule. The application of the Tx and Gen rules is achieved by unfolding, 
folding and goal replacement transformations (which we also discuss). 

5.1 Automating the Ax Rule 

The axiom elimination rule (Ax) infers the equivalence of two predicates p, p'
in a semantics preserving program transformation sequence
$\Gamma = P_0, \ldots, P_i$. In the light of Theorem 3, any such rule will be
incomplete. Therefore, we will construct an effectively checkable sufficient
condition for predicate equivalence. We call this sufficient condition syntactic
equivalence. Given a program transformation sequence $\Gamma = P_0, \ldots, P_i$
and two predicates p, p', we apply Ax if p, p' are syntactically equivalent in
program $P_i$.

As an illustration, consider the program P (with clauses annotated with
integer clause measures) in Figure 3. We can infer that $P \models r \equiv s$
since r and s have identical definitions. Using the equivalence of r and s we can
infer that $P \models p \equiv q$, since the definitions of p and q are, in a
sense, isomorphic.



    p(X) :- r(X).                q(X) :- s(X).
    p(X) :- e(X,Y), p(Y).        q(X) :- e(X,Y), q(Y).
    r(X) :- b(X).                s(X) :- b(X).

Fig. 3. Program with syntactically equivalent predicates.



We formalize this notion of equivalence in the following definition, which
partitions the predicate symbols of a program into equivalence classes. Each
predicate is assumed to be assigned a label, the partition number of the
equivalence class to which it belongs. The labels of all predicates belonging to
the same equivalence class are thus the same, and each equivalence class has a
unique label.

Definition 4 (Syntactic Equivalence) A syntactic equivalence relation
$\stackrel{P}{\sim}$ is an equivalence relation on the set of predicates of a
program P such that for all predicates p, q in P, if $p \stackrel{P}{\sim} q$ then
the following conditions hold:

1. p and q have the same arity, and

2. Let the clauses defining p and q be $\{C_1, \ldots, C_m\}$ and
$\{D_1, \ldots, D_n\}$ respectively. Let $\{C'_1, \ldots, C'_m\}$ and
$\{D'_1, \ldots, D'_n\}$ be such that $C'_l$ ($D'_l$) is obtained by replacing every
predicate symbol r in $C_l$ ($D_l$) by s, where s is the label of the equivalence
class of r (w.r.t. $\stackrel{P}{\sim}$). Then there exist two functions
$f : \{1, \ldots, m\} \to \{1, \ldots, n\}$ and
$g : \{1, \ldots, n\} \to \{1, \ldots, m\}$ such that

(i) $\forall\, 1 \leq i \leq m$: $C'_i$ is an instance of $D'_{f(i)}$

(ii) $\forall\, 1 \leq j \leq n$: $D'_j$ is an instance of $C'_{g(j)}$

Note that there is a largest syntactic equivalence relation. It can be computed
by starting with all predicates in the same class, and repeatedly splitting the
classes that violate properties (1) and (2) until a fixed point is reached. The
existence of the mapping f ensures that for any ground substitution $\theta$ we
have $p(X)\theta \in M(P) \Rightarrow q(X)\theta \in M(P)$ whereas the mapping g
ensures $q(X)\theta \in M(P) \Rightarrow p(X)\theta \in M(P)$. The proof of Lemma 4
below proceeds by induction on the size of ground proofs (see [41] for details).

Lemma 4 (Syntactic Equivalence $\Rightarrow$ Semantic Equivalence) Let
$\stackrel{P}{\sim}$ be the syntactic equivalence relation of the predicates of a
program P. For all predicates p, q, if $p \stackrel{P}{\sim} q$, then
$P \models p \equiv q$.
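The splitting procedure described above can be sketched in Python on the program of Figure 3. This is a simplified illustration, not the authors' implementation: the data representation, the names FIG3, canon and syntactic_classes are hypothetical, and the "is an instance of" test of Definition 4 is replaced by the stronger requirement that the relabelled clause sets coincide up to variable renaming.

```python
# A clause is (head, body); an atom is (predicate, args); variables are
# capitalized strings. This encoding of Figure 3 is an assumption of the sketch.
FIG3 = [
    (("p", ("X",)), [("r", ("X",))]),
    (("p", ("X",)), [("e", ("X", "Y")), ("p", ("Y",))]),
    (("q", ("X",)), [("s", ("X",))]),
    (("q", ("X",)), [("e", ("X", "Y")), ("q", ("Y",))]),
    (("r", ("X",)), [("b", ("X",))]),
    (("s", ("X",)), [("b", ("X",))]),
]


def canon(clause, label):
    """Replace predicate symbols by class labels and rename variables
    in order of first occurrence."""
    names = {}

    def atom(a):
        pred, args = a
        new_args = tuple(
            names.setdefault(x, "V%d" % len(names)) if x[0].isupper() else x
            for x in args)
        return (label[(pred, len(args))], new_args)

    head, body = clause
    return (atom(head), tuple(atom(b) for b in body))


def syntactic_classes(program):
    """Start with one class and split until the partition is stable."""
    preds = sorted({(p, len(args))
                    for h, body in program for p, args in [h, *body]})
    label = {p: 0 for p in preds}
    while True:
        sig = {}
        for p in preds:
            clauses = frozenset(
                canon(c, label) for c in program
                if (c[0][0], len(c[0][1])) == p)
            # The old label is kept so that classes are only ever split.
            sig[p] = (label[p], p[1], clauses)
        ids = {}
        new_label = {p: ids.setdefault(sig[p], len(ids)) for p in preds}
        if len(set(new_label.values())) == len(set(label.values())):
            break
        label = new_label
    classes = {}
    for p, l in label.items():
        classes.setdefault(l, set()).add(p[0])
    return sorted(sorted(c) for c in classes.values())


print(syntactic_classes(FIG3))
```

Under these assumptions p and q end up in one class and r and s in another, matching the informal argument given for Figure 3. Because the sketch compares clause sets for equality rather than testing instances, it may report fewer equivalences than Definition 4 allows.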

5.2 Automating Tx: Transformations as Proof Rules 

The transformation rule Tx corresponds to applying a program transformation 
which does not add any new equivalence proof obligations. Typically an applica- 
tion of this step is either unfolding or folding, or other standard transformations 
like generalization and equality introduction, deletion of subsumed clauses and 
deletion of failed clauses [35]. A single application of all these transformations 
can be automated. 

We transform a logic program to another logic program by applying
transformations that include unfolding and folding. A simple illustration of these
transformations appears in Figure 4. Program $P_1$ is obtained from $P_0$ by
unfolding the occurrence of q(X) in the definition of p. $P_2$ is obtained by
folding q(X) in the second clause of p in $P_1$ using the definition of p in $P_0$
(an earlier program). Intuitively, unfolding is a step of clause resolution whereas
folding replaces an instance of a clause body (in some earlier program in the
transformation sequence) with its head.

    p(X) :- q(X).                    p(0).                          p(0).
    q(0).               Unfold       p(s(X)) :- q(X).      Fold     p(s(X)) :- p(X).
    q(s(X)) :- q(X).    ------>      q(0).                 ---->    q(0).
                                     q(s(X)) :- q(X).               q(s(X)) :- q(X).

    Program $P_0$                    Program $P_1$                  Program $P_2$

Fig. 4. Illustration of unfold/fold transformations



In the following, we present a (simplified) version of the unfolding and folding
transformation rules. Note that each application of these rules is automated. We
say that $P_0, P_1, \ldots, P_n$ is an unfold/fold transformation sequence if the
program $P_{i+1}$ is obtained from $P_i$ ($i \geq 0$) by an application of
unfolding or folding. We always assume that the clauses of each program in such a
transformation sequence are "standardized apart". In other words, the variables of
the clauses of $P_i$ are suitably renamed such that no two clauses have any
variables in common.

Transformation 1 Unfolding Let C be a clause in $P_i$ and A an atom in the
body of C. Let $C_1, \ldots, C_m$ be the clauses in $P_i$ whose heads are unifiable
with A with most general unifiers $\sigma_1, \ldots, \sigma_m$. Let $C'_j$ be the
clause that is obtained by replacing $A\sigma_j$ by the body of $C_j\sigma_j$ in
$C\sigma_j$ ($1 \leq j \leq m$). Assign
$(P_i - \{C\}) \cup \{C'_1, \ldots, C'_m\}$ to $P_{i+1}$. □
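A single unfolding step can be sketched in Python as follows. The sketch is illustrative only: the term representation (capitalized strings for variables, tuples for atoms and compound terms), the helper names, the renaming scheme with numeric suffixes and the omission of the occurs check are all assumptions made here, not part of the transformation as defined above.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Most general unifier extending s, or None (no occurs check)."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def subst(t, s):
    """Apply substitution s to term or atom t."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(x, s) for x in t[1:])
    return t


def rename(clause, suffix):
    """Standardize a clause apart by suffixing its variables."""
    def r(t):
        if is_var(t):
            return t + suffix
        if isinstance(t, tuple):
            return (t[0],) + tuple(r(x) for x in t[1:])
        return t
    head, body = clause
    return (r(head), [r(b) for b in body])


def unfold(program, clause, k):
    """Unfold the k-th body atom of `clause` using the clauses of `program`."""
    head, body = clause
    new = []
    for j, other in enumerate(program):
        h, b = rename(other, "_%d" % j)
        s = unify(body[k], h, {})
        if s is not None:
            new_body = body[:k] + b + body[k + 1:]
            new.append((subst(head, s), [subst(x, s) for x in new_body]))
    return [c for c in program if c is not clause] + new


# Program P0 of Figure 4: p(X) :- q(X).  q(0).  q(s(X)) :- q(X).
P0 = [(("p", "X"), [("q", "X")]),
      (("q", "0"), []),
      (("q", ("s", "X")), [("q", "X")])]
P1 = unfold(P0, P0[0], 0)
for head, body in P1:
    print(head, ":-", body)
```

Applied to $P_0$ of Figure 4, the step removes p(X) :- q(X) and adds p(0) and p(s(X')) :- q(X') for a renamed variable X', which corresponds to $P_1$ up to clause order and variable names.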



Transformation 2 Folding Let $\{C_1, \ldots, C_m\} \subseteq P_i$ where $C_l$
denotes the clause

    $A$ :- $A_{l,1}, \ldots, A_{l,n_l}, A'_1, \ldots, A'_n$

and $\{D_1, \ldots, D_m\} \subseteq P_j$ ($j \leq i$) where $D_l$ is
$B_l$ :- $B_{l,1}, \ldots, B_{l,n_l}$. Further, let:

1. $\forall\, 1 \leq l \leq m\ \exists \sigma_l\ \forall\, 1 \leq k \leq n_l$: $A_{l,k} = B_{l,k}\sigma_l$

2. $B_1\sigma_1 = B_2\sigma_2 = \cdots = B_m\sigma_m = B$

3. $D_1, \ldots, D_m$ are the only clauses in $P_j$ whose heads are unifiable with B.

4. $\forall\, 1 \leq l \leq m$, $\sigma_l$ substitutes the internal variables^ of
$D_l$ to distinct variables which do not appear in $\{A, B, A'_1, \ldots, A'_n\}$.

Then $P_{i+1} := (P_i - \{C_1, \ldots, C_m\}) \cup \{C'\}$ where
$C'$ is $A$ :- $B, A'_1, \ldots, A'_n$. □

$D_1, \ldots, D_m$ are the folder clauses, $C_1, \ldots, C_m$ are the folded
clauses, and B is the folder atom.

Semantics preservation While unfolding is semantics preserving, folding may
introduce circularity and change the program semantics. Recall that we are
dealing with definite logic programs and we consider the least Herbrand model
semantics. For example consider the program $P_0$ in Figure 5; $P_1$ is obtained
from $P_0$ by unfolding the occurrence of q(X) in the body of the clause of p. We
perform folding where the second clause of p in $P_1$ serves as the folded clause
and the second clause of q in $P_0$ serves as the folder clause. We get the program
$P_2$ of Figure 5. Now, let us fold again. We use the second clause of q in $P_2$ as
the folded clause and the second clause of p in $P_1$ as the folder clause. This
produces the program $P_3$ of Figure 5. In $P_3$ the recursive clauses of p and q
refer only to each other, so its least Herbrand model contains only p(a) and q(a).
The program transformation sequence $P_0 \to P_1 \to P_2 \to P_3$ is therefore not
semantics preserving since the least Herbrand model of $P_3$ differs from that of
$P_0$.

^ Variables appearing in the body of a clause, but not its head.



    p(X) :- q(X).          p(a).                  p(a).                     p(a).
    q(a).                  p(f(X)) :- q(X).       p(f(X)) :- q(f(X)).       p(f(X)) :- q(f(X)).
    q(f(X)) :- q(X).       q(a).                  q(a).                     q(a).
                           q(f(X)) :- q(X).       q(f(X)) :- q(X).          q(f(X)) :- p(f(X)).

    Program $P_0$          Program $P_1$          Program $P_2$             Program $P_3$

Fig. 5. An example of incorrect unfold/fold transformation sequence
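The difference between the least Herbrand models of $P_0$ and $P_3$ can be observed on a bounded fragment of the Herbrand universe. The Python sketch below is an illustration under assumptions: the term $f^n(a)$ is represented by the integer n, the programs are instantiated by hand up to a chosen depth, and the function names are hypothetical. It iterates the immediate consequence operator until a fixed point is reached.

```python
def least_model(ground_clauses):
    """Least fixed point of the immediate consequence operator T_P,
    computed over a finite set of ground clauses (head, body)."""
    model = set()
    while True:
        new = {h for h, body in ground_clauses
               if all(b in model for b in body)}
        if new == model:
            return model
        model = new


def ground_p0(depth):
    """Ground instances of P0 over a, f(a), ..., f^depth(a)."""
    cs = [(("q", 0), [])]                          # q(a).
    for n in range(depth + 1):
        cs.append((("p", n), [("q", n)]))          # p(X) :- q(X).
        if n < depth:
            cs.append((("q", n + 1), [("q", n)]))  # q(f(X)) :- q(X).
    return cs


def ground_p3(depth):
    """Ground instances of P3 over the same terms."""
    cs = [(("p", 0), []), (("q", 0), [])]          # p(a).  q(a).
    for n in range(1, depth + 1):
        cs.append((("p", n), [("q", n)]))          # p(f(X)) :- q(f(X)).
        cs.append((("q", n), [("p", n)]))          # q(f(X)) :- p(f(X)).
    return cs


m0 = least_model(ground_p0(5))
m3 = least_model(ground_p3(5))
print(sorted(m0 - m3))
```

With depth 5 the model of $P_0$ contains p and q for every term up to $f^5(a)$, while the model of $P_3$ contains only p(a) and q(a), because the circular clauses never derive anything new.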



Due to this problem of semantics preservation, existing unfold/fold
transformation systems have restricted the folding rule. Thus, in a program
transformation sequence $P_0, P_1, \ldots, P_i$, folding of clause(s) in $P_i$ is
restricted [20, 26, 46, 47]. The restrictions are of two kinds: (a) based on the
unfold/fold steps used to derive the transformation sequence $P_0, \ldots, P_i$,
and (b) based on the syntax of the folder clauses used. In [43] we have shown that
restrictions on the syntax of folder clauses are unnecessary for semantics
preservation. As a consequence of this result, in a folding step we can use
multiple clauses as folder; furthermore some of these clauses may be recursive.

The additional power of our transformation rules is useful in our transforma- 
tion based proofs of temporal properties. Note that temporal properties contain 
fixed point operators. These properties are typically encoded as a logic pro- 
gram predicate with multiple recursive clauses e.g. a least fixed point property 
containing disjunctions is encoded using multiple recursive clauses. A simple 
reachability property EFp (which specifies that a state in which proposition p 
holds is reachable) [9] will be encoded as a logic program as follows: 

    ef(X) :- p(X).
    ef(X) :- trans(X,Y), ef(Y).

where the predicate trans captures the transition relation of the system being 
verified, and p(X) is true if the proposition p holds in state X. This encoding 
contains two clauses, one of which is recursive. 
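For a finite state system given explicitly, the least fixed point reading of the two ef clauses can be sketched in Python. The function name, the tiny transition system and the proposition used below are made up for illustration; only the two-clause structure comes from the encoding above.

```python
def ef(states, trans, p):
    """States satisfying EF p, computed as the least fixed point
    of the two ef clauses over a finite transition relation."""
    sat = {s for s in states if p(s)}      # ef(X) :- p(X).
    changed = True
    while changed:
        changed = False
        for x, y in trans:                 # ef(X) :- trans(X,Y), ef(Y).
            if y in sat and x not in sat:
                sat.add(x)
                changed = True
    return sat


# Hypothetical system: 0 -> 1 -> 2, and 3 loops on itself; p holds in 2.
states = {0, 1, 2, 3}
trans = {(0, 1), (1, 2), (3, 3)}
print(sorted(ef(states, trans, lambda s: s == 2)))
```

State 3 is excluded because it can never reach a state where p holds, which is exactly the case in which no finite derivation of ef(3) exists in the logic program.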



5.3 Automating the Gen Rule 

The Gen rule attempts to prove the e-atom $\Gamma \vdash p \equiv p'$ by proving
the e-atoms $\Gamma, P_{i+1} \vdash p \equiv p'$ and $P_0 \vdash q \equiv q'$ where
$\Gamma = P_0, \ldots, P_i$ is a program transformation sequence. It generates a
new lemma $P_0 \vdash q \equiv q'$ whose proof is used to ensure that
$M(P_i) = M(P_{i+1})$. An application of Gen corresponds to an application of the
Goal Replacement transformation (given in the following). Here, we replace an
occurrence of q with q' in a clause of p or p' as shown below.

    C : p(t) :- G, q(s), G'.          C' : p(t) :- G, q'(s), G'.

    Program $P_i$                     Program $P_{i+1}$

This requires us to show $P_0 \models q \equiv q'$ and therefore we obtain a new
proof obligation $P_0 \vdash q \equiv q'$. We prove $P_0 \vdash q \equiv q'$ by
constructing a different transformation sequence $P_0, P'_1, \ldots, P'_k$ s.t.
$q \stackrel{P'_k}{\sim} q'$, i.e. q, q' are syntactically equivalent in $P'_k$.
Note that since we are replacing q with q' in program $P_i$, the goal replacement
rule requires $P_i \models q \equiv q'$. However for any e-atom
$\Gamma \vdash \ldots$ appearing in a successful tableau,
$M(P_0) = \ldots = M(P_i)$ where $\Gamma = P_0, \ldots, P_i$ (refer to Lemma 2).
Thus, $P_0 \models q \equiv q'$ implies $P_i \models q \equiv q'$.

A (simplified) definition of the Goal Replacement Transformation is given
below. Again, to ensure semantics preservation, the transformation rule needs
to impose additional restrictions on the transformation sequence $P_0, P_1, \ldots, P_i$.
We omit these restrictions here (refer to [43] for details). For a conjunction of atoms
$A_1, \ldots, A_n$, we use the notation $vars(A_1, \ldots, A_n)$ to denote the set of
variables in $A_1, \ldots, A_n$.

Transformation 3 Goal Replacement Let $C$ be a clause $A \text{ :- } A_1, \ldots, A_k, G$
in $P_i$, and $G'$ be an atom such that $vars(G) = vars(G') \subseteq vars(A, A_1, \ldots, A_k)$.
Suppose that for all ground instantiations $\theta$ of $G, G'$ we have
$G\theta \in M(P_i) \Leftrightarrow G'\theta \in M(P_i)$.
Then $P_{i+1} := (P_i - \{C\}) \cup \{C'\}$ where $C' = A \text{ :- } A_1, \ldots, A_k, G'$. □
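
The syntactic part of this transformation can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `Clause` type, the string representation of atoms, and the helper names `variables` and `goal_replace` are assumptions made here. The semantic side condition (equivalence of the two goals in $M(P_i)$) is not checked by the code and is left to the caller, since establishing it is exactly the nested proof obligation.

```python
import re
from typing import NamedTuple, Tuple


class Clause(NamedTuple):
    head: str
    body: Tuple[str, ...]   # atoms written as plain strings, e.g. "p1(X)"


def variables(*atoms):
    """Variables are identifiers that start with an upper-case letter or underscore."""
    found = set()
    for atom in atoms:
        found.update(re.findall(r"\b[A-Z_]\w*", atom))
    return found


def goal_replace(program, clause, old_goal, new_goal):
    """Return a new program in which old_goal is replaced by new_goal in clause."""
    if clause not in program:
        raise ValueError("clause is not in the program")
    idx = clause.body.index(old_goal)          # ValueError if the goal is absent
    rest = clause.body[:idx] + clause.body[idx + 1:]
    # vars(G) = vars(G') and both are contained in vars(A, A1, ..., Ak)
    if variables(old_goal) != variables(new_goal):
        raise ValueError("goals must have the same variables")
    if not variables(old_goal) <= variables(clause.head, *rest):
        raise ValueError("goal variables must occur in the rest of the clause")
    new_clause = Clause(clause.head, clause.body[:idx] + (new_goal,) + clause.body[idx + 1:])
    return [new_clause if c == clause else c for c in program]


# Program P0 of Figure 6; q1(X) is replaced by q2(X) in the third clause.
c3 = Clause("p1(f(X))", ("p1(X)", "t1(X)", "q1(X)"))
p0 = [
    Clause("p1(a)", ()),
    Clause("p1(f(X))", ("p1(X)", "s1(X)")),
    c3,
    Clause("r1(X)", ("s1(X)",)),
    Clause("r1(X)", ("t1(X)", "q2(X)")),
]
p_i = goal_replace(p0, c3, "q1(X)", "q2(X)")
```

The function returns a fresh list and leaves the input untouched, which mirrors the view of a transformation sequence as a list of successive programs.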

6 An Algorithmic Framework for Proof Strategies 

We describe an algorithmic framework for creating strategies to automate the 
construction of the equivalence tableau of an e-atom. The objective is to: (a) find 
equivalence proofs that arise in verification with little or no user intervention, 
and (b) apply deduction rules lazily, i.e. for finite state systems a proof using 
the strategy is equivalent to algorithmic verification. 

Our framework specifies the order in which the different program transfor- 
mations (corresponding to each tableau rule) will be applied. If multiple trans- 
formations of the same kind (e.g., two folding steps) are possible at any point 
in the proof, the framework itself does not specify which transformations to ap- 
ply. That is done by a separate selection function (analogous to literal selection 
in SLD resolution). Thus we only present a framework for constructing
strategies, rather than concrete strategies. Concrete strategies can be constructed by
instantiating this framework.

The tableau rules and associated transformations are applied in the following 
order. As would be expected, the axiom elimination rule (Ax) is used whenever 

p1(a).                              p1(a).
p1(f(X)) :- p1(X), s1(X).           p1(f(X)) :- p1(X), s1(X).
p1(f(X)) :- p1(X), t1(X), q1(X).    p1(f(X)) :- p1(X), t1(X), q2(X).
r1(X) :- s1(X).                     r1(X) :- s1(X).
r1(X) :- t1(X), q2(X).              r1(X) :- t1(X), q2(X).

            P0                                  Pi

p1(a).
p1(f(X)) :- p1(X), r1(X).
r1(X) :- s1(X).
r1(X) :- t1(X), q2(X).

            Pi+1

Fig. 6. Goal replacements to facilitate other transformations.



it is applicable. When the choice is between the Tx and Gen rules, we choose
the former since the default transformation employed by Tx is unfolding, i.e.
resolution. This ensures that our strategies perform on-the-fly model
checking, à la XMC [40], for finite-state systems. To create finite unfolding
sequences we impose a finiteness condition FIN on transformation sequences. We
do not give an exact definition of FIN but only a sufficient condition under which
the resulting unfolding sequences terminate.

Definition 5 (Finiteness condition) Given an a-priori fixed constant $k \in \mathbb{N}$,
an unfolding program transformation sequence $\Gamma = P_0, P_1, \ldots$ satisfies the
finiteness condition $FIN(\Gamma, k)$ if for the clause $C$ and atom $A$ selected for
unfolding at every $P_i$: (1) $A$ is distinct modulo variable renaming from any atom $B$
which was selected in unfolding some clause $D \in P_j$ ($j < i$), where $C$ is obtained
by repeated unfolding of $D$, and (2) the term depth of each argument of $A$ is $< k$.

Typically, we will assume a suitable choice of $k$ and write the finiteness
condition simply as $FIN(\Gamma)$. Condition 1 prohibits infinite unfolding sequences of
the form: unfolding p(X) using the clause p(X) :- p(X), i.e. unfolding sequences
where the same atom is infinitely unfolded. Condition 2 prohibits infinite
unfolding sequences of the form: unfolding p(X) using the clause p(X) :- p(s(X)),
i.e. where a different atom is unfolded every time, but there are infinitely many
atoms to unfold.
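
The two conditions can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: terms are nested tuples such as ("s", "X"), variables are strings starting with an upper-case letter, a variable or constant is taken to have term depth 1, and the history is kept for a single chain of unfoldings rather than per ancestor clause as in Definition 5.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def depth(t):
    """Term depth; variables and constants count as depth 1 (an assumption)."""
    if isinstance(t, tuple):
        return 1 + max((depth(a) for a in t[1:]), default=0)
    return 1


def canonical(t, names=None):
    """Rename variables in order of first occurrence, so variants compare equal."""
    names = {} if names is None else names
    if is_var(t):
        return names.setdefault(t, f"_V{len(names)}")
    if isinstance(t, tuple):
        return (t[0],) + tuple(canonical(a, names) for a in t[1:])
    return t


def fin_allows(atom, history, k):
    """Condition 1: atom is not a variant of an earlier selection.
    Condition 2: every argument has term depth < k."""
    return canonical(atom) not in history and all(depth(a) < k for a in atom[1:])


def unfold_chain(start, step, k, limit=100):
    """Select start, step(start), ... until FIN rejects an atom; return accepted atoms."""
    history, accepted, atom = set(), [], start
    while len(accepted) < limit and fin_allows(atom, history, k):
        history.add(canonical(atom))
        accepted.append(atom)
        atom = step(atom)
    return accepted


# p(X) :- p(X): the next selected atom is a variant of the first one.
same_atom = unfold_chain(("p", "X"), lambda a: ("p", "Y"), k=3)
# p(X) :- p(s(X)): the selected atoms grow until the depth bound stops them.
growing = unfold_chain(("p", "X"), lambda a: ("p", ("s", a[1])), k=3)
```

With k = 3 the first chain stops after one selection (condition 1) and the second after p(X) and p(s(X)) (condition 2).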

We note that various online techniques for ensuring termination of unfolding 
sequences have been studied in the context of partial deduction [28, 32]. These 
techniques proceed by establishing a well-founded / well-quasi order among the 
atoms unfolded. This order may be fixed beforehand, or refined online as the
unfolding proceeds. Such techniques could be adapted for controlling unfolding 
in our predicate equivalence prover. 

If FIN prohibits any further unfolding, we either apply the folding
transformation associated with Tx or use the Gen rule. Care must be taken, however,
when Gen is chosen. Recall from the definition of Gen (refer to Table 1) that
$\Gamma, P_{i+1} \vdash p \equiv p'$ implies $\Gamma \vdash p \equiv p'$ only if we can
prove a new equivalence $P_0 \vdash q \equiv q'$. In other words,
$P_{i+1} \models p \equiv p'$ implies $P_i \models p \equiv p'$ only if
$P_0 \models q \equiv q'$. Since Gen itself does not specify the goals $q$ and $q'$ in
the new equivalence, its application is highly nondeterministic. We limit the
nondeterminism by using Gen only to enable Ax or Tx rules. For instance, consider the
transformation sequence in Figure 6 (the intermediate programs $P_1, \ldots, P_{i-1}$ in
the program transformation sequence $P_0, P_1, \ldots, P_{i-1}, P_i$ are not shown).
Applying goal replacement in $P_0$ under the assumption that
$P_0 \models \mathtt{q1} \equiv \mathtt{q2}$ enables the subsequent folding which
transforms $P_i$ into $P_{i+1}$.

Thus, when no further unfoldings are possible, we apply any possible folding.
If no foldings are enabled, we check if there are new goal equivalences that will
enable a folding step. We call this a conditional folding step. For instance, in
program $P_0$ of Figure 6, equivalence of q1(X) and q2(X) enables folding. Note
that the test for syntactic equivalence is only done on predicates, whereas a goal
is a conjunction of atoms. However, we can reduce a goal equivalence check to a
predicate equivalence check by introducing new predicate names for the goals. A
subtle point needs to be noted here. When we introduce new predicate names to a
program, clearly the least Herbrand model cannot be preserved. As is common
in the program transformation literature [46, 20], we rectify this apparent anomaly
by assuming that all new predicate names introduced are present in the initial
program $P_0$ of a program transformation sequence.

Finally, we look for new goal equivalences which, if valid, can lead to
syntactic equivalence. This is called a conditional equivalence step. For instance,
suppose that in program $P_{i+1}$ (in Figure 6) there are two additional predicates p2
and r2, and further assume that p2 is defined using the clauses

p2(a).

p2(f(Y)) :- p2(Y), r2(Y).

Now if r2 and r1 are semantically equivalent, we can perform this goal
replacement to obtain a program $P_{i+2}$ where p1 and p2 are defined as follows. Thus,
in $P_{i+2}$ we can conclude that $\mathtt{p1} \sim \mathtt{p2}$.

p1(a).                        p2(a).

p1(f(X)) :- p1(X), r1(X).     p2(f(Y)) :- p2(Y), r1(Y).

The above intuitions are formalized in the algorithmic framework Prove (see
Figure 7). Given a program transformation sequence $\Gamma$ and a pair of predicates
$p, p'$, Prove attempts to prove that $\Gamma \vdash p \equiv p'$. Prove searches
nondeterministically for a proof: if multiple cases of the nondeterministic choice are
enabled, then they will be tried in the order specified in Prove. If none of the cases
apply, then evaluation fails, and backtracks to the most recent unexplored case. There
may also be nondeterminism within a case; for instance, many fold transformations
may be applicable at the same time. We again select nondeterministically
from this set of applicable transformations. By providing selection functions to
pick from these applicable transformations, one can implement a variety of
concrete strategies. Note that Prove uses two different markings in the process of
constructing a proof for $\Gamma \vdash p \equiv p'$. The marking proved remembers
predicate equivalences which have already been proved. This marking allows us to cache
subproofs in a proof. The marking proof_attempt keeps track of predicate pairs
whose equivalence has not yet been established, but is being attempted by Prove
via transformations. This marking is essential for ensuring termination of Prove.
The proof of $P_0 \vdash p \equiv p'$ may (via a conditional equivalence step) generate
the (sub)-equivalence $P_0 \vdash p \equiv p'$. Prove deems this proof path as failed
and explores other proof paths.
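
The role of the two markings can be sketched in Python on an abstract search problem. This is a deliberately simplified model and not the Prove algorithm itself: the program transformations are abstracted away, and the hypothetical table `paths` lists, for each pair of predicates, the alternative proof paths together with the sub-equivalences (lemmas) each path would need.

```python
def prove(pair, paths, proved, attempt):
    """Search for a proof of `pair`, caching successes and cutting circular attempts."""
    if pair in proved:                 # cached subproof
        return True
    if pair in attempt:                # circular obligation: this proof path fails
        return False
    attempt.add(pair)
    ok = any(
        all(prove(sub, paths, proved, attempt) for sub in lemmas)
        for lemmas in paths.get(pair, [])
    )
    attempt.discard(pair)
    if ok:
        proved.add(pair)
    return ok


# Hypothetical obligations: thm/gen needs live'/live, which needs no lemma.
# a/b has one circular path and one direct path; c/d only has a circular path.
paths = {
    ("thm", "gen"): [[("live'", "live")]],
    ("live'", "live"): [[]],
    ("a", "b"): [[("a", "b")], []],
    ("c", "d"): [[("c", "d")]],
}
proved = set()
main_result = prove(("thm", "gen"), paths, proved, set())
circular_then_direct = prove(("a", "b"), paths, proved, set())
only_circular = prove(("c", "d"), paths, proved, set())
```

Only successful results are cached here; a failure that arose under a pending attempt is not recorded, since the same pair might be provable in another context.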

Our algorithmic framework Prove uses the following functions. Functions
unfold(P) and fold(P) apply unfolding and folding transformations respectively to
program P and return a new program. Whenever conditional folding is possible,
the function new_goal_equiv_for_fold(P) finds a pair of goals whose replacement is
necessary to do a fold transformation. Similarly, when conditional equivalence is
possible, new_goal_equiv_for_equiv(p, p', P) finds a pair of goals Q, Q' such that
syntactic equivalence of p and p' can be established after replacing Q with Q' in P.

Finally, replace_and_prove constructs nested proofs for sub-equivalences
created by applying the Gen rule. Thus, replace_and_prove(p, p', Q, Q', $\Gamma$)
performs the following sequence of steps (where $\Gamma = P_0, \ldots, P_i$):

1. first introduces new predicate definitions q and q' for goals Q and Q'
respectively (if such definitions do not already exist),

2. proves the equivalence $P_0 \vdash q \equiv q'$ by invoking Prove,

3. replaces goal Q by goal Q' in clauses of p or p' in program $P_i$ to obtain
program $P_{i+1}$, and

4. finally invokes Prove to discharge the obligation
$\Gamma, P_{i+1} \vdash p \equiv p'$. This completes the proof of
$\Gamma \vdash p \equiv p'$.

Termination of Prove It can be verified that only finite unfolding sequences
satisfy FIN. This is because in any unfolding sequence of clauses $C_1, \ldots, C_n$
where $C_{i+1}$ is obtained from $C_i$ via unfolding, condition 1 of Definition 5 ensures
that the selected atom in each $C_i$ is distinct, and condition 2 ensures that there
are only finitely many atoms which can ever be selected for unfolding.

Therefore, the length of each predicate equivalence proof itself is finite
(assuming folding always reduces program size, which can be ensured). However,
a proof for $p \equiv p'$ may require $q \equiv q'$ as a lemma, whose proof in turn may
require $r \equiv r'$ as a lemma, and so on. Since the number of distinct equivalences
is quadratic in the number of predicate symbols in the program, the number
of subproofs is finite if the number of new predicate names introduced is finite.
Thus, we have:

Lemma 5 Prove (refer to Figure 7) terminates provided the number of definitions
introduced (i.e. new predicate symbols added) is finite.

Efficiency of unfolding/folding The algorithmic framework Prove does not
specify how we implement unfold(P) and fold(P), i.e. the heuristics for choosing
unfolding/folding steps. These heuristics are extremely important for
efficient proof construction via program transformation. Full details
of these heuristics appear in [41, 44].

algorithm Prove(p, p': predicates, Γ: prog. seq.)
begin
    if proof_attempt(p, p') is marked then return false
    mark proof_attempt(p, p')
    let Γ = P0, ..., Pi
    (* Ax rule *)
    if (p ∼ p' in Pi  ∨  proved(p, p')) then
        return true
    else nondeterministic choice
        (* Tx rule *)
        case FIN((Γ, unfold(Pi))):                  (* Unfolding *)
            return Prove(p, p', (Γ, unfold(Pi)))
        case Folding is possible in Pi:
            return Prove(p, p', (Γ, fold(Pi)))
        (* Gen rule *)
        case Conditional folding is possible in Pi:
            let (Q, Q') = new_goal_equiv_for_fold(Pi)
            return replace_and_prove(p, p', Q, Q', Γ)
        case Conditional equivalence is possible in Pi:
            let (Q, Q') = new_goal_equiv_for_equiv(p, p', Pi)
            return replace_and_prove(p, p', Q, Q', Γ)
    end choices
    mark proved(p, p')
    unmark proof_attempt(p, p')
end

Fig. 7. Algorithmic framework for equivalence tableau construction.



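[Figure 8: proof tree with root $\vdash \mathtt{thm}(X) \equiv \mathtt{gen}(X)$. Node
labels, in order: Unfolds; Defn. Intro. (live'(Y) :- live([0|Y])); Goal Replacement;
Unfolds; $\mathtt{live'} \sim \mathtt{live}$; $\mathtt{thm} \sim \mathtt{gen}$.]

Fig. 8. Liveness Proof of n-bit shift register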

Proving Predicate Implications Note that the proof system given in Table 1,
the algorithmic framework Prove and the strategies to guide the transformations
in Prove are aimed at proving equivalence of program predicates. Our proof
technique can be readily extended to prove predicate implications, i.e. proof
obligations of the form

$$\forall \text{ ground substitutions } \theta:\quad p(X)\theta \in M(P_0) \Rightarrow p'(X)\theta \in M(P_0)$$

This extension involves (1) relaxing the definition of syntactic equivalence
(Definition 4) to test for implications only, and (2) generating conditions of the
form $q \Rightarrow q'$ by applying conditional folding and conditional equivalence.

7 An Example Proof 

Recall the logic program of Figure 1 (page 266), which formulates a liveness
property about token-passing chains, namely, that the token eventually reaches the
left-most process in any arbitrarily long chain. We obtain $P_0$, the starting point
of our transformation sequence, from the encoding of the verification problem
in Figure 1. To establish the liveness property, we prove that
$P_0 \vdash \mathtt{thm}(X) \equiv \mathtt{gen}(X)$ by invoking
Prove(thm, gen, $\langle P_0 \rangle$). The proof is illustrated in Figure 8.

Proof of $P_0 \vdash \mathtt{thm} \equiv \mathtt{gen}$: Since
$\mathtt{thm} \not\sim \mathtt{gen}$, we must transform the predicates.
By repeatedly unfolding the definition of thm in $P_0$, we obtain program $P_5$ where
thm is defined as:

thm([1]).

thm([0|X]) :- gen(X), X = [1|_].

thm([0|X]) :- gen(X), trans(X,Y), live([0|Y]).

Further unfolding in $P_5$ is not possible since it involves unfolding an atom which
is already unfolded in the sequence $P_0, \ldots, P_5$, thereby risking non-termination.
In addition, no folding transformation is applicable at this stage. However, if
$\forall Y\; \mathtt{live}([0|Y]) \Leftrightarrow \mathtt{live}(Y)$, we can fold the last
two clauses of thm. Thus, conditional folding is possible at $P_5$, and hence
replace_and_prove is invoked with Q = live([0|Y]) and Q' = live(Y). Since live([0|Y])
is not an open atom, a new name:

live'(Y) :- live([0|Y]).

is added to $P_5$ to yield $P_6$. This simply converts the goal equivalence problem
of showing $\forall Y\; \mathtt{live}([0|Y]) \Leftrightarrow \mathtt{live}(Y)$ to a
predicate equivalence problem. We fold the third clause of thm above using the newly
introduced clause as folder, obtaining $P_7$:

thm([1]).

thm([0|X]) :- gen(X), X = [1|_].

thm([0|X]) :- gen(X), trans(X,Y), live'(Y).

We then proceed to show $P_0 \vdash \mathtt{live'} \equiv \mathtt{live}$. This subproof
is shown in the left branch of the tree in Figure 8. Then we replace live'(X) with
live(X) in the definition of thm in $P_7$ (right branch in Figure 8).

Proof of $P_0 \vdash \mathtt{live'} \equiv \mathtt{live}$:
Prove(live', live, $\langle P_0 \rangle$) performs a series of
unfoldings, yielding programs $P_8$, $P_9$ and $P_{10}$. Any further unfolding involves
unfolding an atom already unfolded in the sequence $P_0, P_8, P_9, P_{10}$ and risks
non-termination. In $P_{10}$, live' is defined by the following clauses:

live'([1|Z]).

live'(X) :- trans(X,Z), live([0|Z]).

Folding is applicable in $P_{10}$, in the second clause of live', yielding $P_{11}$ with

live'([1|Z]).

live'(X) :- trans(X,Z), live'(Z).

Now, $\mathtt{live'} \sim \mathtt{live}$ and hence
Prove(live', live, $\langle P_0 \rangle$) terminates. We assume
that all occurrences of the equality predicate in the clause bodies are removed
(via unification) prior to any syntactic equivalence check.

Resuming Proof of $P_0 \vdash \mathtt{thm} \equiv \mathtt{gen}$: Now replace_and_prove
replaces live'(X) with live(X) in the definition of thm in $P_7$, yielding $P_{12}$ with:

thm([1]).

thm([0|X]) :- gen(X), X = [1|_].

thm([0|X]) :- gen(X), trans(X,Y), live(Y).

We can now fold the last two clauses of thm using the definition of live in $P_0$.
Note that the folding uses a recursive definition of a predicate with multiple
clauses. The program-transformation system developed by us in [43] was the
first to permit such folding. Thus we obtain $P_{13}$:

thm([1]).

thm([0|X]) :- gen(X), live(X).

This completes the conditional folding step (which had invoked replace_and_prove
and thereby constructed $\mathtt{live'} \equiv \mathtt{live}$ as a subproof). We can
fold again using the definition of thm in $P_0$, giving $P_{14}$ where thm is defined as:

thm([1]).

thm([0|X]) :- thm(X).

We now have $\mathtt{thm} \sim \mathtt{gen}$, thereby completing the equivalence proof.

It is interesting to observe in Figure 8 that the unfolding steps that
transform $P_0$ to $P_5$ and $P_7$ to $P_{10}$ are interleaved with folding steps. In
other words, algorithmic and deductive verification steps are interleaved in the proof
of the equivalence $P_0 \vdash \mathtt{thm} \equiv \mathtt{gen}$.

8 Experiments 

So far, we have presented a tableau based proof system for proving equivalence of
predicates in a logic program. Furthermore, we presented an algorithmic
framework Prove for guiding the application of the rules in the proof system. However,
this algorithmic framework Prove is nondeterministic since at each step several
transformations may be applicable. Hence it is necessary to develop appropriate
selection functions to distill concrete strategies from the algorithmic framework.
Indeed we have implemented such strategies in a predicate equivalence prover
for verifying parameterized protocols of different network topologies (the
communication pattern between the different constituent processes of a parameterized
network is called its network topology). Given a parameterized system and
a liveness/invariant property to be proved, our prover extracts the predicate
equivalences that need to be established. It tries to use the network topology of
the parameterized system being verified to construct concrete proof strategies.
These strategies then guide the proof search, which proceeds without any user
intervention. The proof search is terminating and sound but incomplete (i.e. the
prover may fail to establish a correct property). A full-fledged discussion of the
concrete proof strategies (obtained by instantiating the algorithmic framework
Prove of Section 6) appears in [41, 44].

In this section, we present the experimental results obtained using our
predicate equivalence prover. The prover is built on top of the XSB tabled logic
programming system [50] which supports top-down memoized evaluation of logic
programs. We report results on parameterized cache coherence protocols,
including (a) single bus broadcast protocols e.g. Mesi, (b) single bus protocols
with global conditions e.g. Illinois, and (c) multiple bus hierarchical protocols.
We also report experimental results for the Java Meta-locking algorithm [1], a
distributed algorithm to ensure secure access of shared objects by various Java
threads. The benchmarks cover various network topologies including star, tree
and complete graph networks.

Results In Table 2, Meta-lock denotes the Java meta-locking algorithm from 
Sun Microsystems. The Java Meta-Locking Algorithm is a distributed algorithm 
recently proposed by Sun Microsystems to ensure mutually exclusive access of 
shared Java objects by Java threads. A proof of correctness of the algorithm 
involves proving mutual exclusion in the access of a Java object by an arbitrary
number of Java threads. Previously, model checking has been used to verify
mutual exclusion for different instances of the protocol, obtained by fixing the 
number of threads [5] . We have used our program transformation based prover to 
automatically construct a proof of mutual exclusion for the entire infinite family. 
The sources of infiniteness in the Meta-locking algorithm include (a) unbounded 
number of Java threads, and (b) data variables of infinite domain in the shared 
Java object. 

Mesi and Berkeley RISC are single bus broadcast protocols [4, 15, 18] . Illinois 
is a single bus cache coherence protocol with global conditions which cannot 
be modeled as a broadcast protocol [12]. Tree-cache is a binary tree network 
which simulates the interactions between the cache agents in a hierarchical cache 
coherence protocol [41]. 

Table 2 presents experimental results obtained using our prover: a summary
of the invariants proved along with the time taken, the number of unfolding
steps and the number of deductive steps (i.e. folding, conditional equivalence

Protocol    Invariant               Time (sec) in [12]  Our time (secs)  #Unf   #Ded
------------------------------------------------------------------------------------
Meta-Lock   #owner + #handout < 2                       129.8            1981   311
Mesi        #m + #e < 2             1                   3.2              325    69
            #m + #e = 0 ∨ #s = 0    0.5                 2.9              308    63
Illinois    #dirty < 2              5.3                 35.7             2501   137
Berkeley    #dirty < 2              0.6                 6.8              503    146
            #ex + #sh < 2           -                   Fails            -      -
Tree-cache  #bus_with_data < 2      -                   9.9              178    18

Table 2. Summary of protocol verification results



etc.) performed in constructing the proof. The total time includes the time taken
by (a) unfolding steps, (b) deductive steps, and (c) the invocation of nested
proof obligations. All experiments reported here were conducted on a Sun
Ultra-Enterprise workstation with two 336 MHz CPUs and 2 GB of RAM. In the
table, we have used the following notational shorthand: #s denotes the number
of processes in local state s. In column 3 of the table, we have shown the timings
for the same proofs using the constraint logic program evaluation based checker
of [12]. The work of [12] was aimed at verifying parameterized cache coherence
protocols. Note that the timings of [12] were obtained using a Pentium 133 with
Linux 2.0.32.

Comparison with CLP-based Verifier [12] Our prover is slower than the CLP-based
verifier of [12] on single bus cache coherence protocols. In fact, there is up to
an order of magnitude difference between
the time taken by our prover and the time taken by the CLP-based verifier. One
source of the relative inefficiency of our prover is the way the proof steps
are applied. The prototype implementation of our prover implements both the
unfolding and folding steps via meta-programming. While meta-programming
appears inevitable for the folding steps, the unfolding steps can be
performed directly at the level of the underlying evaluation engine. Such an
implementation would improve running times, could be directly compared to the
CLP-based verifier, and could be used to evaluate whether the overheads due to
folding steps in our prover exceed the overheads due to constraint solving in the
CLP-based verifier. Also, the abstraction technique of [12] is not suitable for
parameterized tree networks such as Tree-cache, which can be verified by our
inductive proof technique.
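
The "order of magnitude" remark can be checked against the rows of Table 2 for which both timings are available. The short Python sketch below only divides the figures reported in the table; the dictionary layout and names are assumptions, and since the two sets of timings were measured on different machines the ratios are indicative only.

```python
# (protocol, invariant) -> (time in [12], our time), both in seconds, from Table 2
timings = {
    ("Mesi", "first invariant"): (1.0, 3.2),
    ("Mesi", "second invariant"): (0.5, 2.9),
    ("Illinois", "#dirty < 2"): (5.3, 35.7),
    ("Berkeley", "#dirty < 2"): (0.6, 6.8),
}

# Slowdown factor of our prover relative to the CLP-based verifier, per row
slowdown = {row: ours / theirs for row, (theirs, ours) in timings.items()}
worst = max(slowdown, key=slowdown.get)
```

The factors range from about 3 (Mesi) to about 11 (Berkeley), which is consistent with the statement that the gap is at most roughly one order of magnitude.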

Comparison across Benchmarks Note that the number of deductive steps
in a proof is consistently small compared to the number of unfolding steps. This
is owing to our proof search strategy, which repeatedly applies unfolding steps
until none are applicable. Furthermore, note that the tree network example has a
longer running time, with fewer unfolding and deductive steps, than
other cache coherence protocols such as the Mesi protocol. Due to its network
topology, the state representation in the tree network has a different term
structure than the other protocols (where the global states are typically represented
as lists). This partially accounts for the increase in the running time. In addition,
certain deductive steps (such as conditional equivalence) employ more expensive
search heuristics for the tree topology. Finally, the Java meta-locking algorithm
represents global states as lists, but involves nested induction over both the
control and the data of the protocol, thereby increasing the number of predicate
equivalence proof obligations. Extra proof obligations are incurred due to nested
induction on the infinite data domain, thereby increasing the time to construct
the proof.

9 Discussion 

In this chapter, we have presented a technique for proving predicate equivalences 
in a definite logic program. This is used for verifying infinite-state concurrent 
systems, in particular the class of parameterized concurrent systems. We have 
described how the parameterized system verification problem can be reduced to 
proving equivalence of logic program predicates. First we review related work on 
using logic program transformations to construct proofs. 



9.1 Related Work 

Unfold/fold transformations of logic programs have been widely used for program 
specialization and optimization. Relatively little work has been done on using 
these transformations for constructing proofs. Unfold/fold transformations can 
be used to construct induction proofs of program properties. In such proofs, 
unfolding accomplishes the base case and the finite part of the induction step, 
and folding roughly corresponds to application of induction hypothesis. This 
observation has been exploited in [23, 25, 27, 36, 37] to construct inductive 
proofs of program properties. 

Hsiang and Srivas in [23] extended Prolog’s evaluation with “limited forward
chaining” to perform inductive theorem proving. This limited forward chaining
step is in fact a very restricted form of folding: only the theorem statement
(which is restricted to be conjunctive) can be used as a folder clause. The works
of [25, 27] are closer to ours. They proved certain first order theorems about the
Least Herbrand Model of a definite logic program via induction. In particular,
they observed that the least fixed point semantics of logic programs could be
exploited to employ fixed point induction. Our usage of the transformations is
similar. Given a program $P$ we intend to prove $p \equiv q$ in the Least Herbrand
Model of $P$. To do this proof by induction, we transform $p$ and $q$ to obtain a
program $P'$. If the transformed definitions of $p$ and $q$ in $P'$ are “syntactically
equivalent” (Definition 4) then our proof is finished. Note that the syntactic
equivalence check is in fact an application of fixed point induction. It allows us
to show $p \equiv q$ in $M(P')$ (the least Herbrand model of $P'$). Furthermore, since
$M(P') = M(P)$ this amounts to showing $p \equiv q$ in program $P$. Thus, in our work
predicates are transformed to facilitate the construction of induction schemes
(for proving predicate equivalence). [25] also exploits transformations for similar
purposes. However, their method performs conjunctive folding using only a single
non-recursive clause. Apart from the restriction in their folding rule, they also
do not employ goal replacement in their induction proofs.

The idea of using logic program transformations for proving goal
equivalences was explored by Pettorossi and Proietti in [36, 37]. These works employ
more restricted unfold/fold transformation rules, e.g. folding using non-recursive
clauses. In [19], a proof technique based on transformation of constraint logic
programs was proposed. It is used to verify safety properties of systems with an
arbitrary number of (potentially infinite state) processes. Unlike our work, the proof
technique of [19] is not based on mathematical induction. Instead it produces
uniform proofs by abstracting away the number of processes. The work of [12]
proves safety properties of parameterized systems via evaluation of constraint
logic programs (the evaluation includes acceleration techniques to ensure
termination). Partial deduction and abstract interpretation of logic programs are used
in [29] for proving safety properties of infinite state systems; these techniques
can be applied to parameterized families as well.

The reader might notice similarities between a proof system based on un- 
fold/fold transformations and a proof system based on tabled resolution [7, 48]. 
Tabled resolution combines resolution proofs with memoing of calls and answers. 
Since folding corresponds to remembering the original definition of predicates, 
there is some correspondence between folding and memoing. However, folding 
can remember conjunctions and/or disjunctions of atoms as the definition of a 
predicate. This is not possible in tabled resolution. Furthermore, in tabled res- 
olution when a tabled call C is encountered, the answers produced so far for C 
are used to produce new answers for C. In folding, when the clause bodies in the old
definition of a predicate are encountered, they are replaced by the clause head.

We note that there is a lot of research work on using logic program transfor- 
mations for optimization and/or partial evaluation [35]. Furthermore, the area 
of automated inductive theorem proving has substantial literature of its own [6] . 
These works are not discussed here. Instead, we have concentrated only on tech- 
niques which extend logic program evaluation for proving program properties. 

9.2 Summary 

In a broader perspective, our proof technique is geared to automate nested in- 
duction proofs, where each induction proceeds without hypothesis strengthening. 
Furthermore, the induction schema as well as the requisite lemmas should be im- 
plicitly encoded in the logic program itself. We have employed our lightweight 
inductive proof technique for verifying a specific class of infinite state concurrent 
systems: parameterized systems. Such systems occur widely in computing since 
many distributed algorithms in telecommunication and information processing 
applications constitute a parameterized concurrent system. We have used our 
proof technique to verify parameterized networks of various interconnection
patterns: chain, ring, tree, star and complete graph networks. A prover based on
our technique has been used to verify design invariants of real-life distributed 
algorithms such as the recently developed Java meta-locking algorithm from Sun 
Microsystems [1]. 

Our program transformation based proof technique unifies algorithmic and 
deductive verification steps (i.e. model checking and theorem proving steps)
in a framework. Essentially the proof technique amounts to integrating limited 
deductive steps by enhancing the search based evaluation of a model checker. 
This is different from the traditional way of integrating model checking and 
theorem proving where a model checker is incorporated as a decision procedure 
into a theorem prover [39] . 

The reader should however note that unlike model checking, our inductive 
proof technique currently does not generate counter-example evidence. Due to 
the undecidability of parameterized system verification, the problem of counter- 
example generation is more involved. This is because a proof attempt may fail 
due to either the temporal property being false or the inability of the proof 
system to construct a proof. The problem of navigating and explaining proof 
attempts has been studied for interactive theorem provers [24]. For our trans- 
formation based automated prover, we can a-posteriori provide explanation of 
success/failure of a proof attempt. In particular, we can provide to the user 
the tree of predicate equivalence proof obligations constructed. Once a node in 
this tree is selected (by the user), we can provide snapshots of the transforma- 
tion sequence constructed which led to the success/failure of the proof attempt 
of that predicate equivalence. Developing tools and techniques for explaining 
transformation based proof runs is an attractive topic of future work. 

In conclusion, we would like to highlight some interesting aspects of our
proposed integration (of algorithmic and deductive verification). First, the proof
technique thus obtained allows arbitrary interleaving of algorithmic and deduc- 
tive steps in a proof. In contrast, by incorporating model checking as a decision 
procedure into a theorem prover, the model checker is always invoked as a sub- 
routine. Secondly, the integration is not only tight but also extensible for verifi- 
cation of different flavors of concurrent systems. Our transformation based proof 
technique is a flexible extension of model checking via logic program evaluation 
(since one of our transformations corresponds to logic program evaluation). By
extending the underlying programming language to constraint logic programs 
one can verify families of timed systems with similar proof techniques. Finally, 
note that the proof technique supports zero overhead theorem proving [45]. Concurrent
systems which can be verified without deductive reasoning (such as finite
state and data independent systems) are verified via model checking since the 
deductive transformations are applied lazily.

Abstract. We propose a set of transformation rules for constraint logic
programs with negation. We assume that every program is locally strati- 
fied and, thus, it has a unique perfect model. We give sufficient conditions 
which ensure that the proposed set of transformation rules preserves the 
perfect model of the programs. Our rules extend in some respects the 
rules for logic programs and constraint logic programs already consid- 
ered in the literature and, in particular, they include a rule for unfolding 
a clause with respect to a negative literal. 



1 Introduction 

Program transformation is a very powerful methodology for developing correct 
and efficient programs from formal specifications. This methodology is particu- 
larly convenient in the case of declarative programming languages, where pro- 
grams are formulas and program transformations can be viewed as replacements 
of formulas by new, equivalent formulas. 

The main advantage of using the program transformation methodology for 
program development is that it allows us to address the correctness and the effi- 
ciency issues at separate stages. Often little effort is required for encoding formal 
specifications (written by using equational or logical formalisms) as declarative 
programs (written as functional or logic programs). These programs are correct
by construction, but they are often computationally inefficient. Here is where 
program transformation comes into play: from a correct (and possibly inefficient) 
initial program version we can derive a correct and efficient program version by 
means of a sequence of program transformations that preserve correctness. We 
say that a program transformation preserves correctness, or it is correct, if the 
semantics of the initial program is equal to the semantics of the derived program. 

A very popular approach to applying the program transformation
methodology is the one based on transformation rules and strategies [9]: the rules
are elementary transformations that preserve the program semantics and the 
strategies are (possibly nondeterministic) procedures that guide the application 
of transformation rules with the objective of deriving efficient programs. Thus, a
program transformation is realized by a sequence P0, ..., Pn of programs, called
a transformation sequence, where, for k = 0, ..., n−1, Pk+1 is derived from Pk
by applying a transformation rule according to a given transformation strategy.
A transformation sequence is said to be correct if the programs P0, ..., Pn have
the same semantics.

Various sets of program transformation rules have been proposed in the literature
for several declarative programming languages, such as functional [9,39],
logic [44], constraint [7,11,27], and functional-logic languages [1]. In this paper
we consider a constraint logic programming language with negation [19,28] and 
we study the correctness of a set of transformation rules that extends the sets 
which were already considered for constraint logic programming languages. We 
will not deal here with transformation strategies, but we will show through some 
examples (see Section 5) that the transformation rules can be applied in a rather 
systematic (yet not fully automatic) way. 

We assume that constraint logic programs are locally stratified [4,35]. This 
assumption simplifies our treatment because the semantics of a locally stratified 
program is determined by its unique perfect model which is equal to its unique 
stable model, which is also its unique, total well-founded model [4,35]. (The def- 
initions of locally stratified programs, perfect models, and other notions used in 
this paper are recalled in Section 2.) 

The set of transformation rules we consider in this paper includes the unfolding
and folding rules (see, for instance, [7,11,16,17,23,27,29,31,37,38,40,42,43,44]).
In order to understand how these rules work, let us first consider propositional 
programs. The definition of an atom a in a program is the set of clauses that 
have a as head. The atom a is also called the definiendum. The disjunction of the
bodies of the clauses that constitute the definition of a is called the definiens.
Basically, the application of the unfolding rule consists in replacing an atom
occurring in the body of a clause by its definiens and then applying, if necessary,
some suitable boolean laws to obtain clauses. For instance, given the following
programs P1 and P2:

P1: p ← q ∧ r          P2: p ← ¬a ∧ r
    q ← ¬a                 p ← b ∧ r
    q ← b                  q ← ¬a
                           q ← b

we have that by unfolding the first clause of program P1 we get program P2.

Folding is the inverse of unfolding and consists in replacing an occurrence 
of a definiens by the corresponding occurrence of the definiendum (before this 
replacement we may apply suitable boolean laws). For instance, by folding the 
first two clauses of P2 using the definition of q, we get program P1. An important
feature of the folding rule is that the definition used for folding may occur in a 
previous program in the transformation sequence. The formal definitions of the 
unfolding and folding transformation rules for constraint logic programs will be 
given in Section 3. The usefulness of the program transformation approach based
on the unfolding and folding rules is now very well recognized in the scientific
community, as indicated by a large number of papers (see [29] for a survey).
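
The propositional unfolding and folding steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the formal rules of Section 3: the clause encoding (a pair of a head and a frozenset of literals, with "~" standing for negation) and the function names `definiens`, `unfold` and `fold` are assumptions made for this example.

```python
def fs(*literals):
    """Build a clause body as a frozenset of literals ("~a" encodes the negation of a)."""
    return frozenset(literals)


def definiens(program, atom):
    """Bodies of the clauses with head `atom`, i.e. the disjuncts of its definiens."""
    return [body for head, body in program if head == atom]


def unfold(program, clause, atom):
    """Replace a positive occurrence of `atom` in `clause` by its definiens,
    distributing the conjunction over the disjunction to obtain clauses again."""
    head, body = clause
    if atom not in body:
        raise ValueError("atom must occur positively in the clause body")
    rest = body - {atom}
    derived = {(head, rest | disjunct) for disjunct in definiens(program, atom)}
    return (program - {clause}) | derived


def fold(program, clauses, atom, defining_program):
    """Inverse step: replace `clauses`, which together contain every disjunct of
    the definiens of `atom`, by a single clause containing `atom` instead."""
    clauses = set(clauses)
    heads = {head for head, _ in clauses}
    if len(heads) != 1:
        raise ValueError("clauses to be folded must share the same head")
    rest = frozenset.intersection(*(body for _, body in clauses))
    expected = {rest | d for d in definiens(defining_program, atom)}
    if {body for _, body in clauses} != expected:
        raise ValueError("clauses do not match the definiens")
    return (program - clauses) | {(heads.pop(), rest | {atom})}


# Programs P1 and P2 of the example above.
P1 = {("p", fs("q", "r")), ("q", fs("~a")), ("q", fs("b"))}
P2 = unfold(P1, ("p", fs("q", "r")), "q")
P1_AGAIN = fold(P2, [("p", fs("~a", "r")), ("p", fs("b", "r"))], "q", P2)
```

Here `P2` contains the clauses p ← ¬a ∧ r and p ← b ∧ r together with the two clauses for q, and folding them back yields `P1` again.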
A relevant property we will prove in this paper is that the unfolding of a clause
w.r.t. an atom occurring in a negative literal, also called negative unfolding, 
preserves the perfect model of a locally stratified program. This property is 
interesting, because negative unfolding is useful for program transformation, 
but it may not preserve the perfect models (nor the stable models, nor the well- 
founded model) if the programs are not locally stratified. For instance, let us 
consider the following programs P1 and P2:

P1: p ← ¬q          P2: p ← p
    q ← ¬p              q ← ¬p

Program P2 can be obtained by unfolding the first clause of P1 (i.e., by first
replacing q by the body ¬p of the clause defining q, and then replacing ¬¬p by
p). Program P1 has two perfect models: {p} and {q}, while program P2 has the
unique perfect model {q}.
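
The difference between P1 and P2 can be checked mechanically for one of the semantics mentioned above. The following sketch enumerates stable models (not perfect models) of a propositional program by the usual guess-and-check construction: a candidate set M is stable if it equals the least model of the reduct obtained by deleting the clauses whose negative literals are contradicted by M. The encoding of clauses as triples (head, positive atoms, negated atoms) and all function names are assumptions of this example.

```python
from itertools import combinations


def least_model(definite_clauses):
    """Least model of a negation-free propositional program, by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, positive in definite_clauses:
            if head not in model and positive <= model:
                model.add(head)
                changed = True
    return model


def stable_models(program, atoms):
    """Enumerate all stable models of `program` over the given set of atoms."""
    found = []
    ordered = sorted(atoms)
    for size in range(len(ordered) + 1):
        for candidate in combinations(ordered, size):
            m = set(candidate)
            # Reduct: drop clauses with a negated atom in m, then drop negative literals.
            reduct = [(h, pos) for h, pos, neg in program if not (neg & m)]
            if least_model(reduct) == m:
                found.append(m)
    return found


# P1: p <- not q, q <- not p        P2: p <- p, q <- not p
P1 = [("p", set(), {"q"}), ("q", set(), {"p"})]
P2 = [("p", {"p"}, set()), ("q", set(), {"p"})]
MODELS_P1 = stable_models(P1, {"p", "q"})
MODELS_P2 = stable_models(P2, {"p", "q"})
```

Under these assumptions `MODELS_P1` contains both {p} and {q}, while `MODELS_P2` contains only {q}, so the negative unfolding step has changed the semantics of the program.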

In this paper we consider the following transformation rules (see Section 3): 
definition introduction and definition elimination (for introducing and elimi- 
nating definitions of predicates), positive and negative unfolding, positive and 
negative folding (that is, unfolding and folding w.r.t. a positive and a negative 
occurrence of an atom, respectively), and also rules for applying boolean laws 
and rules for manipulating constraints. 

Similarly to other sets of transformation rules presented in the literature 
(see, for instance, [1,7,9,11,27,39,44]), a transformation sequence constructed by 
arbitrary applications of the transformation rules presented in this paper, may 
be incorrect. As customary, we will ensure the correctness of transformation 
sequences only if they satisfy suitable properties: we will call them admissible 
sequences (see Section 4). Although our transformation rules are extensions or 
adaptations of transformation rules already considered for stratified logic pro- 
grams or logic programs, in general, for our correctness proof we cannot rely 
on already known results. Indeed, the definition of an admissible transforma- 
tion sequence depends on the interaction among the rules and, in particular, 
correctness may not be preserved if we modify even one rule only. 

To see that known results do not extend in a straightforward way when 
adding negative unfolding to a set of transformation rules, let us consider the 
transformation sequences constructed by first (1) unfolding all clauses of a
definition δ and then (2) folding some of the resulting clauses by using the definition
δ itself. If at Step (1) we use positive unfolding only, then the perfect model
semantics is preserved [37,42], while this semantics may not be preserved if we use
negative unfolding, as indicated by the following example. 

Example 1. Let us consider the transformation sequence P0, P1, P2, where:

P0: p(X) ← ¬q(X)      P1: p(X) ← X<0 ∧ ¬q(X)      P2: p(X) ← X<0 ∧ p(X)
    q(X) ← X≥0            q(X) ← X≥0                   q(X) ← X≥0
    q(X) ← q(X)           q(X) ← q(X)                  q(X) ← q(X)

Program P1 is derived by unfolding the first clause of P0 w.r.t. the negative literal
¬q(X) (that is, by replacing the definiendum q(X) by its definiens X≥0 ∨ q(X),
and then applying De Morgan's law). Program P2 is derived by folding the first
clause of P1 using the definition p(X) ← ¬q(X) in P0. We have that, for any
a < 0, the atom p(a) belongs to the perfect model of P0, while p(a) does not
belong to the perfect model of P2.
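
The claim of Example 1 can be reproduced on a small finite domain. The sketch below is an illustration under simplifying assumptions: the variable X ranges over the integers −3, ..., 3 instead of the reals, all atoms are unary and share the same argument, and the stratification is by predicate (q below p), which is a special case of local stratification. The evaluator `perfect_model` and the rule encoding are hypothetical names introduced for this example.

```python
def perfect_model(rules, strata, domain):
    """Evaluate a stratified program stratum by stratum.

    Each rule is (head_pred, constraint, positive_preds, negative_preds); negated
    predicates must belong to strictly lower strata than the head predicate.
    """
    model = set()
    for level in sorted(set(strata.values())):
        changed = True
        while changed:
            changed = False
            for head, constraint, positive, negative in rules:
                if strata[head] != level:
                    continue
                for a in domain:
                    atom = (head, a)
                    if atom in model or not constraint(a):
                        continue
                    if all((p, a) in model for p in positive) and \
                            all((n, a) not in model for n in negative):
                        model.add(atom)
                        changed = True
    return model


def p_atoms(model):
    """Arguments a such that p(a) belongs to the model."""
    return sorted(a for pred, a in model if pred == "p")


DOMAIN = range(-3, 4)
STRATA = {"q": 0, "p": 1}
Q_RULES = [("q", lambda x: x >= 0, [], []),      # q(X) <- X >= 0
           ("q", lambda x: True, ["q"], [])]     # q(X) <- q(X)
P0 = [("p", lambda x: True, [], ["q"])] + Q_RULES    # p(X) <- not q(X)
P2 = [("p", lambda x: x < 0, ["p"], [])] + Q_RULES   # p(X) <- X < 0, p(X)

P_IN_P0 = p_atoms(perfect_model(P0, STRATA, DOMAIN))
P_IN_P2 = p_atoms(perfect_model(P2, STRATA, DOMAIN))
```

With this encoding `P_IN_P0` lists exactly the negative elements of the domain, whereas `P_IN_P2` is empty, in agreement with the example.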

The main result of this paper (see Theorem 3 in Section 4) shows the correctness 
of a transformation sequence constructed by first (1) unfolding all clauses of 
a (non-recursive) definition δ w.r.t. a positive literal, then (2) unfolding zero
or more clauses w.r.t. a negative literal, and finally (3) folding some of the
resulting clauses by using the definition δ. The correctness of such transformation
sequences cannot be established by the correctness results presented in [37,42]. 

The paper is structured as follows. In Section 2 we present the basic def- 
initions of locally stratified constraint logic programs and perfect models. In 
Section 3 we present our set of transformation rules and in Section 4 we give 
sufficient conditions on transformation sequences that ensure the preservation 
of perfect models. In Section 5 we present some examples of program derivation 
using our transformation rules. In all these examples the negative unfolding rule 
plays a crucial role. Finally, in Section 6 we discuss related work and future 
research. 

2 Preliminaries 

In this section we recall the syntax and semantics of constraint logic programs 
with negation. In particular, we will give the definitions of locally stratified 
programs and perfect models. For notions not defined here the reader may refer 
to [2,4,19,20,26]. 

2.1 Syntax of Constraint Logic Programs 

We consider a first order language ℒ generated by an infinite set Vars of variables,
a set Funct of function symbols with arity, and a set Pred of predicate symbols
(or predicates, for short) with arity. We assume that Pred is the union of two
disjoint sets: (i) the set Predc of constraint predicate symbols, including the
equality symbol =, and (ii) the set Predu of user defined predicate symbols.

A term of ℒ is either a variable or an expression of the form f(t1, ..., tn),
where f is an n-ary function symbol and t1, ..., tn are terms. An atomic formula
is an expression of the form p(t1, ..., tn) where p is an n-ary predicate symbol
and t1, ..., tn are terms. A formula of ℒ is either an atomic formula or a formula
constructed from atomic formulas by means of connectives (¬, ∧, ∨, →, ↔)
and quantifiers (∃, ∀).

Let e be a term, or a formula, or a set of terms or formulas. The set of
variables occurring in e is denoted by vars(e). Given a formula φ, the set of the
free variables occurring in φ is denoted by FV(φ). A term or a formula is ground
iff it does not contain variables. Given a set X = {X1, ..., Xn} of n variables,
by ∀X φ we denote the formula ∀X1 ... ∀Xn φ. By ∀(φ) we denote the universal
closure of φ, that is, the formula ∀X φ, where FV(φ) = X. Analogous notations
will be adopted for the existential quantifier ∃.

A primitive constraint is an atomic formula p(t1, ..., tn) where p is a predicate
symbol in Predc. The set C of constraints is the smallest set of formulas of ℒ
that contains all primitive constraints and is closed w.r.t. negation, conjunction,
and existential quantification. This closure assumption simplifies our treatment,
but as we will indicate at the end of this section, we can do without it.

An atom is an atomic formula p(t1, ..., tn) where p is an element of Predu
and t1, ..., tn are terms. A literal is either an atom A, also called positive literal,
or a negated atom ¬A, also called negative literal. Given any literal L, by L̄
we denote: (i) ¬A, if L is the atom A, and (ii) A, if L is the negated atom
¬A. A goal is a (possibly empty) conjunction of literals (here we depart from
the terminology used in [2,26], where a goal is defined as the negation of a
conjunction of literals). A constrained literal is the conjunction of a constraint
and a literal. A constrained goal is the conjunction of a constraint and a goal.

A clause γ is a formula of the form H ← c ∧ G, where: (i) H is an atom, called
the head of γ and denoted hd(γ), and (ii) c ∧ G is a constrained goal, called the
body of γ and denoted bd(γ). A conjunction of constraints and/or literals may
be empty (in which case it is equivalent to true). A clause of the form H ← c,
where c is a constraint and the goal part of the body is the empty conjunction
of literals, is called a constrained fact. A clause of the form H ←, whose body is
the empty conjunction, is called a fact.

A constraint logic program (or program, for short) is a finite set of clauses. A 
definite clause is a clause whose body has no occurrences of negative literals. A 
definite program is a finite set of definite clauses. 

Given two atoms p(t1, ..., tn) and p(u1, ..., un), we denote by p(t1, ..., tn)
= p(u1, ..., un) the constraint: t1 = u1 ∧ ... ∧ tn = un. For the notion of
substitution and for the application of a substitution to a term we refer to
[2,26]. Given a formula φ and a substitution {X1/t1, ..., Xn/tn} we denote by
φ{X1/t1, ..., Xn/tn} the result of simultaneously replacing in φ all free
occurrences of X1, ..., Xn by t1, ..., tn.

We say that a predicate p immediately depends on a predicate q in a program
P iff there exists in P a clause of the form p(...) ← B and q occurs in B. We
say that p depends on q in P iff there exists a sequence p1, ..., pn, with n > 1,
of predicates such that: (i) p1 = p, (ii) pn = q, and (iii) for i = 1, ..., n−1, pi
immediately depends on pi+1. Given a user defined predicate p and a program
P, the definition of p in P, denoted Def(p, P), is the set of clauses γ in P such
that p is the predicate symbol of hd(γ).

A variable renaming is a bijective mapping from Vars to Vars. The application
of a variable renaming ρ to a formula φ returns the formula ρ(φ), which is
said to be a variant of φ, obtained by replacing each (bound or free) variable
occurrence X in φ by the variable ρ(X). A variant of a set {φ1, ..., φn} of formulas
is the set {ρ(φ1), ..., ρ(φn)}, also denoted ρ({φ1, ..., φn}). During program
transformation we will feel free to silently apply variable renamings to clauses 
and to sets of clauses because, as the reader may verify, they preserve program 
semantics (see Section 2.2). Moreover, we will feel free to change the names of the 
bound variables occurring in constraints, as is usually done in predicate calculus.

2.2 Semantics of Constraint Logic Programs

In this section we present the definition of the semantics of constraint logic 
programs with negation. This definition extends similar definitions given in the 
literature for definite constraint logic programs [19] and logic programs with 
negation [4,35]. 

We proceed as follows: (i) we define an interpretation for the constraints,
following the approach used in first order logic (see, for instance, [2]), (ii) we
introduce the notion of 𝒟-model, that is, a model for constraint logic programs
which is parametric w.r.t. the interpretation 𝒟 for the constraints, (iii) we
introduce the notion of locally stratified program, and finally, (iv) we define the perfect
𝒟-model (also called perfect model, for short) of locally stratified programs.

An interpretation 𝒟 for the constraints consists of: (1) a non-empty set D,
called carrier, (2) an assignment of a function f𝒟: Dⁿ → D to each n-ary function
symbol f in Funct, and (3) an assignment of a relation p𝒟 over Dⁿ to each n-ary
predicate symbol in Predc. In particular, 𝒟 assigns the set {(d,d) | d ∈ D} to
the equality symbol =.

We assume that D is a set of ground terms. This is not restrictive because
we may add suitable 0-ary function symbols to ℒ.

Given a formula φ whose predicate symbols belong to Predc, we consider the
satisfaction relation 𝒟 ⊨ φ, which is defined as usual in first order predicate
calculus (see, for instance, [2]). A constraint c is said to be satisfiable iff its
existential closure is satisfiable, that is, 𝒟 ⊨ ∃(c). If 𝒟 ⊭ ∃(c), then c is said to
be unsatisfiable in 𝒟.

Given an interpretation 𝒟 for the constraints, a 𝒟-interpretation I assigns
a relation over Dⁿ to each n-ary user defined predicate symbol in Predu, that
is, I can be identified with a subset of the set B𝒟 of ground atoms defined as
follows:

B𝒟 = {p(d1, ..., dn) | p is a predicate symbol in Predu and (d1, ..., dn) ∈ Dⁿ}.

A valuation is a function v: Vars → D. We extend the domain of a valuation
v to terms, constraints, literals, and clauses as we now indicate. Given a term
t, we inductively define the term v(t) as follows: (i) if t is a variable X then
v(t) = v(X), and (ii) if t is f(t1, ..., tn) then v(t) = f𝒟(v(t1), ..., v(tn)). Given
a constraint c, v(c) is the constraint obtained by replacing every free variable
X ∈ FV(c) by the ground term v(X). Notice that v(c) is a closed formula which
may be not ground. Given a literal L, (i) if L is the atom p(t1, ..., tn), then v(L)
is the ground atom p(v(t1), ..., v(tn)), and (ii) if L is the negated atom ¬A, then
v(L) is the ground, negated atom ¬v(A). Given a clause γ: H ← c ∧ L1 ∧ ... ∧ Lm,
v(γ) is the clause v(H) ← v(c) ∧ v(L1) ∧ ... ∧ v(Lm).

Let I be a 𝒟-interpretation and v a valuation. Given a literal L, we say that
v(L) is true in I iff either (i) L is an atom and v(L) ∈ I, or (ii) L is a negated
atom ¬A and v(A) ∉ I. We say that the literal v(L) is false in I iff it is not
true in I. Given a clause γ: H ← c ∧ L1 ∧ ... ∧ Lm, v(γ) is true in I iff either
(i) v(H) is true in I, or (ii) 𝒟 ⊨ ¬v(c), or (iii) there exists i ∈ {1, ..., m} such
that v(Li) is false in I.

A 𝒟-interpretation I is a 𝒟-model of a program P iff for every clause γ in P
and for every valuation v, we have that v(γ) is true in I. It can be shown that
every definite constraint logic program P has a least 𝒟-model w.r.t. set inclusion
(see, for instance, [20]).

Unfortunately, constraint logic programs which are not definite may fail to
have a least 𝒟-model. For example, the program consisting of the clause p ← ¬q
has the two minimal (not least) models {p} and {q}. This fact has motivated
the introduction of the set of locally stratified programs [4,35]. With every locally
stratified program one can associate a unique (minimal, but not least, w.r.t. set
inclusion) model, called perfect model, as follows.

A local stratification is a function σ: B𝒟 → W, where W is the set of countable
ordinals. If A ∈ B𝒟 and σ(A) is the ordinal α, we say that the stratum of A
is α. Given a clause γ in a program P, a valuation v, and a local stratification
σ, we say that a clause v(γ) of the form: H ← c ∧ L1 ∧ ... ∧ Lm is locally
stratified w.r.t. σ iff either 𝒟 ⊨ ¬c or, for i = 1, ..., m, if Li is an atom A then
σ(H) ≥ σ(A) else if Li is a negated atom ¬A then σ(H) > σ(A). Given a local
stratification σ, we say that program P is locally stratified w.r.t. σ, or σ is a
local stratification for P, iff for every clause γ in P and for every valuation v,
the clause v(γ) is locally stratified w.r.t. σ. A program P is locally stratified iff
there exists a local stratification σ such that P is locally stratified w.r.t. σ. For
instance, let us consider the following program Even:

even(0) ←
even(X) ← X = Y+1 ∧ ¬even(Y)

where the interpretation for the constraints is as follows: (1) the carrier is the set
of the natural numbers, and (2) the addition function is assigned to the function
symbol +. The program Even is locally stratified w.r.t. the stratification function
σ such that for every natural number n, σ(even(n)) = n.
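
The local stratification condition for Even can be checked on a finite prefix of the natural numbers. The sketch below is illustrative only: it enumerates the ground instances of the two clauses whose constraint is satisfied (instances with an unsatisfied constraint are locally stratified by definition), and the names `ground_instances` and `locally_stratified` are assumptions of this example.

```python
def sigma(atom):
    """Stratification function of the text: the stratum of even(n) is n."""
    _, n = atom
    return n


def ground_instances(limit):
    """Ground instances (head, positive atoms, negated atoms) with a true constraint."""
    yield ("even", 0), [], []                       # even(0) <-
    for y in range(limit):
        yield ("even", y + 1), [], [("even", y)]    # even(y+1) <- not even(y)


def locally_stratified(instances, stratum):
    """Heads must be >= positive body atoms and > negated body atoms."""
    return all(
        all(stratum(head) >= stratum(a) for a in positive)
        and all(stratum(head) > stratum(a) for a in negative)
        for head, positive, negative in instances
    )


OK_WITH_SIGMA = locally_stratified(ground_instances(50), sigma)
OK_WITH_CONSTANT = locally_stratified(ground_instances(50), lambda atom: 0)
```

The check succeeds for σ(even(n)) = n and fails for a constant function, since a negated atom requires a strictly smaller stratum.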

The perfect model of a program P which is locally stratified w.r.t. a stratification
function σ is the least 𝒟-model of P w.r.t. a suitable ordering based on
σ, as specified by the following definition. This ordering is, in general, different
from set inclusion.

Definition 1. (Perfect Model) [35]. Let P be a locally stratified program, let σ
be any local stratification for P, and let I, J be 𝒟-interpretations. We say that
I is preferable to J, and we write I ≺ J iff for every A1 ∈ I − J there exists
A2 ∈ J − I such that σ(A1) > σ(A2). A 𝒟-model M of P is called a perfect
𝒟-model (or a perfect model, for short) iff for every 𝒟-model N of P different
from M, we have that M ≺ N.

It can be shown that the perfect model of a locally stratified program always
exists and does not depend on the choice of the local stratification function σ,
as stated by the following theorem.

Theorem 1. [35] Every locally stratified program P has a unique perfect model
M(P).

By Theorem 1, M(P) is the least 𝒟-model of P w.r.t. the ≺ ordering. For
instance, the perfect model of the program consisting of the clause p ← ¬q is
{p} because σ(p) > σ(q) and, thus, the 𝒟-model {p} is preferable to the 𝒟-model
{q} (i.e., {p} ≺ {q}). Similarly, it can be verified that the perfect model of the
program Even is M(Even) = {even(n) | n is an even non-negative integer}. In
Section 4 we will provide a method for constructing the perfect model of a locally
stratified program based on the notion of proof tree.
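
The preference relation of Definition 1 is easy to test on the one-clause program p ← ¬q. The following sketch is a brute-force illustration for finite propositional programs only; the clause encoding (head, positive atoms, negated atoms), the stratification σ(q) = 0, σ(p) = 1 and the function names are assumptions of this example.

```python
from itertools import combinations


def preferable(i, j, sigma):
    """I is preferable to J: every atom of I - J is dominated by some atom of
    J - I lying in a strictly lower stratum."""
    return all(any(sigma[a1] > sigma[a2] for a2 in j - i) for a1 in i - j)


def is_model(interpretation, program):
    """A clause is true if its head is true or some body literal is false."""
    return all(
        head in interpretation
        or any(a not in interpretation for a in positive)
        or any(a in interpretation for a in negative)
        for head, positive, negative in program
    )


def perfect_models(program, sigma):
    """All models that are preferable to every other model (brute force)."""
    atoms = sorted(sigma)
    candidates = [set(c) for r in range(len(atoms) + 1)
                  for c in combinations(atoms, r)]
    models = [m for m in candidates if is_model(m, program)]
    return [m for m in models
            if all(preferable(m, n, sigma) for n in models if n != m)]


PROGRAM = [("p", set(), {"q"})]     # p <- not q
SIGMA = {"q": 0, "p": 1}
PERFECT = perfect_models(PROGRAM, SIGMA)
```

The models of the program are {p}, {q} and {p, q}; among them only {p} is preferable to all the others, as stated in the text.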

Let us conclude this section by showing that the assumption that the set C of 
constraints is closed w.r.t. negation, conjunction, and existential quantification 
is not really needed. Indeed, given a locally stratified clause H ← c ∧ G, where
the constraint c is written by using negation, or conjunction, or existential
quantification, we can replace H ← c ∧ G by an equivalent set of locally stratified
clauses. For instance, if c is ∃X d then we can replace H ← c ∧ G by the two
clauses:

H ← newp(Y1, ..., Yn) ∧ G
newp(Y1, ..., Yn) ← d

where newp is a new, user defined predicate and {Y1, ..., Yn} = FV(∃X d).
Analogous replacements can be applied in the case where a constraint is written 
by using negation or conjunction. 



3 The Transformation Rules

In this section we present a set of rules for transforming locally stratified con- 
straint logic programs. We postpone to Section 6 the detailed comparison of our 
set of transformation rules with other sets of rules which were proposed in the 
literature for transforming logic programs and constraint logic programs. The 
application of our transformation rules is illustrated by simple examples. More 
complex examples will be given in Section 5. 

The transformation rules are used to construct a transformation sequence,
that is, a sequence P0, ..., Pn of programs. We assume that P0 is locally stratified
w.r.t. a fixed local stratification function σ: B𝒟 → W, and we will say
that P0, ..., Pn is constructed using σ. We also assume that we are given a set
Predint ⊆ Predu of predicates of interest.

A transformation sequence P0, ..., Pn is constructed as follows. Suppose that
we have constructed a transformation sequence P0, ..., Pk, for 0 ≤ k ≤ n−1. The
next program Pk+1 in the transformation sequence is derived from program Pk
by the application of a transformation rule among R1-R10 defined below.

Our first rule is the definition introduction rule, which is applied for introducing
a new predicate definition. Notice that by this rule we can introduce a
new predicate defined by m (≥ 1) non-recursive clauses.

R1. Definition Introduction. Let us consider m (≥ 1) clauses of the form:

δ1: newp(X1, ..., Xh) ← c1 ∧ G1
...
δm: newp(X1, ..., Xh) ← cm ∧ Gm

where:

(i) newp is a predicate symbol not occurring in {P0, ..., Pk},

(ii) X1, ..., Xh are distinct variables occurring in FV({c1 ∧ G1, ..., cm ∧ Gm}),

(iii) every predicate symbol occurring in {G1, ..., Gm} also occurs in P0, and

(iv) for every ground substitution ϑ with domain {X1, ..., Xh},
σ(newp(X1, ..., Xh)ϑ) is the least ordinal α such that, for every valuation v and
for every i = 1, ..., m,
either (iv.1) 𝒟 ⊨ ¬v(ciϑ) or (iv.2) for every literal L occurring in v(Giϑ), if L
is an atom A then α ≥ σ(A) else if L is a negated atom ¬A then α > σ(A).

By definition introduction (or definition, for short) from program Pk we derive
the program Pk+1 = Pk ∪ {δ1, ..., δm}. For k ≥ 0, Defsk denotes the set of clauses
introduced by the definition rule during the transformation sequence P0, ..., Pk.
In particular, Defs0 = ∅.

Condition (iv), which is needed to ensure that σ is a local stratification for
each program in the transformation sequence P0, ..., Pk+1 (see Proposition 1),
is not actually restrictive, because newp is a predicate symbol not occurring
in P0 and, thus, we can always choose the local stratification σ for P0 so that
Condition (iv) holds. As a consequence of Condition (iv), σ(newp(X1, ..., Xh)ϑ)
is the least upper bound of Sp ∪ Sn w.r.t. < where:

Sp = {σ(A) | 1 ≤ i ≤ m, v is a valuation, A occurs in v(Giϑ),
      𝒟 ⊨ v(ciϑ)}, and

Sn = {σ(A) + 1 | 1 ≤ i ≤ m, v is a valuation, ¬A occurs in v(Giϑ),
      𝒟 ⊨ v(ciϑ)}.

In particular, if for i = 1, ..., m, 𝒟 ⊨ ¬∃(ciϑ), then Sp ∪ Sn = ∅ and we have
that σ(newp(X1, ..., Xh)ϑ) = 0.
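
For finitely many ground body instances with natural-number strata, the least upper bound of Sp ∪ Sn is simply a maximum, which the following sketch computes. It is an illustration under that finiteness assumption only (in general the sets may be infinite and the least upper bound may be a limit ordinal); the function name `new_stratum` and the encoding of a ground body instance as a triple (constraint holds, atoms occurring positively, atoms occurring negated) are hypothetical.

```python
def new_stratum(ground_bodies, sigma):
    """Stratum of a ground instance of the new head, given the ground instances
    of the bodies of its defining clauses as a list of triples
    (constraint_holds, positive_atoms, negated_atoms)."""
    s_p = {sigma[a] for holds, positive, _ in ground_bodies if holds
           for a in positive}
    s_n = {sigma[a] + 1 for holds, _, negative in ground_bodies if holds
           for a in negative}
    # An empty union corresponds to stratum 0.
    return max(s_p | s_n, default=0)


SIGMA = {"q(1)": 2, "r(1)": 3}
BODIES = [
    (True, ["q(1)"], ["r(1)"]),    # satisfiable constraint, body q(1), not r(1)
    (False, [], ["r(1)"]),         # unsatisfiable constraint: contributes nothing
]
STRATUM = new_stratum(BODIES, SIGMA)
STRATUM_ALL_UNSAT = new_stratum([(False, ["q(1)"], [])], SIGMA)
```

In this instance the negated atom r(1) forces the new atom strictly above stratum 3, while instances with an unsatisfiable constraint are ignored.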

The definition elimination rule is the inverse of the definition introduction 
rule. It can be used to discard from a given program the definitions of predicates 
which are not of interest. 

R2. Definition Elimination. Let p be a predicate such that no predicate of
the set Predint of the predicates of interest depends on p in Pk. By eliminating
the definition of p, from program Pk we derive the new program Pk+1 = Pk −
Def(p, Pk).
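
The applicability condition of rule R2 amounts to a reachability check on the predicate dependency graph. The sketch below is a simplified illustration: clauses are abstracted to pairs (head predicate, list of body predicates), and the names `depends_on` and `eliminate_definition` are assumptions of this example. Refusing to eliminate a predicate that is itself of interest is an extra, conservative guard added here and is not stated in the rule.

```python
def depends_on(program, start):
    """Predicates reachable from `start` through one or more immediate dependencies."""
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        for head, body in program:
            if head == current:
                for q in body:
                    if q not in seen:
                        seen.add(q)
                        stack.append(q)
    return seen


def eliminate_definition(program, p, pred_int):
    """Remove Def(p, program) if no predicate of interest depends on p."""
    if p in pred_int or any(p in depends_on(program, q) for q in pred_int):
        raise ValueError("a predicate of interest depends on " + p)
    return [clause for clause in program if clause[0] != p]


PROGRAM = [("p", ["q"]), ("q", []), ("aux", ["q"])]
WITHOUT_AUX = eliminate_definition(PROGRAM, "aux", {"p"})
```

Here `aux` can be discarded because `p` does not depend on it, whereas an attempt to eliminate `q` is rejected.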

The unfolding rule consists in: (i) replacing an atom p(t1, ..., tm) occurring
in the body of a clause, by a suitable instance of the disjunction of the
bodies of the clauses which are the definition of p, and (ii) applying suitable
boolean laws for deriving clauses. The suitable instance of Step (i) is computed
by adding a constraint of the form p(t1, ..., tm) = K for each head K of a clause
in Def(p, Pk). There are two unfolding rules: (1) the positive unfolding rule, and
(2) the negative unfolding rule, corresponding to the case where p(t1, ..., tm)
occurs positively and negatively, respectively, in the body of the clause to be
unfolded. In order to perform Step (ii), in the case of positive unfolding we apply
the distributivity law, and in the case of negative unfolding we apply De
Morgan's, distributivity, and double negation elimination laws.

R3. Positive Unfolding. Let γ: H ← c ∧ GL ∧ A ∧ GR be a clause in program
Pk and let P'k be a variant of Pk without common variables with γ. Let

γ1: K1 ← c1 ∧ B1
...
γm: Km ← cm ∧ Bm

where m ≥ 0 and B1, ..., Bm are conjunctions of literals, be all clauses of program
P'k such that, for i = 1, ..., m, 𝒟 ⊨ ∃(c ∧ A = Ki ∧ ci).

By unfolding clause γ w.r.t. the atom A we derive the clauses

η1: H ← c ∧ A = K1 ∧ c1 ∧ GL ∧ B1 ∧ GR
...
ηm: H ← c ∧ A = Km ∧ cm ∧ GL ∧ Bm ∧ GR

and from program Pk we derive the program Pk+1 = (Pk − {γ}) ∪ {η1, ..., ηm}.

Notice that if m = 0 then, by positive unfolding, clause γ is deleted from Pk.

Example 2. Let Pk be the following program:

1. p(X) ← X≥1 ∧ q(X)
2. q(Y) ← Y = 0
3. q(Y) ← Y = Z+1 ∧ q(Z)

where we assume that the interpretation for the constraints is given by the
structure ℛ of the real numbers. Let us unfold clause 1 w.r.t. the atom q(X).
The constraint X≥1 ∧ X = Y ∧ Y = 0 constructed from the constraints of
clauses 1 and 2 is unsatisfiable, that is, ℛ ⊨ ¬∃X∃Y (X≥1 ∧ X = Y ∧ Y = 0),
while the constraint X≥1 ∧ X = Y ∧ Y = Z+1 constructed from the constraints
of clauses 1 and 3, is satisfiable. Thus, we derive the following program Pk+1:

1u. p(X) ← X≥1 ∧ X = Y ∧ Y = Z+1 ∧ q(Z)
2. q(Y) ← Y = 0
3. q(Y) ← Y = Z+1 ∧ q(Z)

R4. Negative Unfolding. Let 7: H ^ c A Gr A ^A A Gr be a clause in 
program Pk and let Pi be a variant of Pk without common variables with 7. 
Let 

7i : Pi ^ Cl A Pi 
Am ■ Pm Cm A Pm 

where m > 0 and Pi, ... , Pm are conjunction of literals, be all clauses of program 
Pi such that, for z = 1, . . . ,m, T> ^ 3(c A A = Ki A c,). Suppose that, for 
z = 1 , . . . , m, there exist an idempotent substitution = {3fii/tii, . . . , Xin/Un} 

and a constraint di such that the following conditions hold: 

(i) P ^ V(c ^ {{A = Ki A Cj) ^ {X^i=Ui A ... A Xin = Un A d*))), 

(ii) {Pji, ■ • ■ , Pm} C V^, where U = FV (7*), and 
(hi) FV{di A B,A) C FV{c A A). 

Then, from the formula 

ifo ■ cAGr A^{3Vi (A = Pi A Cl A Pi) V. . .V3Um {A = KmACmABm)) AGr 
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we get an equivalent disjunction of constrained goals by performing the following 
steps. In these steps we silently apply the associativity of A and V. 

Step 1 . {Eliminate 3 ) Since Conditions (i), (ii), and (iii) hold, we derive from ^/>o 
the following equivalent formula: 

ipi : c A Gl a ^((di A V ... V {dm A Bm'&m)) A Gr 

Step 2 . {Push ^ inside) We apply to t/'i as long as possible the following rewrit- 
ings of formulas, where d is a constraint. At is an atom, G, Gi, G2 are goals, 
and D is a disjunction of constrained literals: 

~^{{d A G) V D) — > ^(d A G) A 
~^{d A G) — > ^d V (d A ^G) 

-(Gl A G2) — ^ -Gl V -G2 

~^^At — > At 

Thus, from ipi we derive the following equivalent formula: 

'ip 2 ■ cAGiA(^di V (di A (Liii?i V . . . V Lipi^i))) 

A ... 

A {^dm V {dm A {Lml'dm^ ... V Lmq'dm))) 

AGr 

where Lu A ... A Lip is Bi, . . ., and Lmi A ... A Lmq is Bm- 
Step 3. (Push $\vee$ outside) We apply to $\psi_2$ as long as possible the following rewriting
of formulas, where $\varphi_1$, $\varphi_2$, and $\varphi_3$ are formulas:

$\varphi_1 \wedge (\varphi_2 \vee \varphi_3) \longrightarrow (\varphi_1 \wedge \varphi_2) \vee (\varphi_1 \wedge \varphi_3)$

and then we move constraints to the left of literals by applying the commutativity
of $\wedge$. Thus, from $\psi_2$ we get an equivalent formula of the form:

$\psi_3$: $(c \wedge e_1 \wedge G_L \wedge Q_1 \wedge G_R) \vee \ldots \vee (c \wedge e_r \wedge G_L \wedge Q_r \wedge G_R)$

where $e_1, \ldots, e_r$ are constraints and $Q_1, \ldots, Q_r$ are goals.

Step 4. (Remove unsatisfiable disjuncts) We remove from $\psi_3$ every disjunct
$(c \wedge e_j \wedge G_L \wedge Q_j \wedge G_R)$, with $1 \leq j \leq r$, such that $\mathcal{D} \models \neg\exists(c \wedge e_j)$, thereby deriving
an equivalent disjunction of constrained goals of the form:

$\psi_4$: $(c \wedge e_1 \wedge G_L \wedge Q_1 \wedge G_R) \vee \ldots \vee (c \wedge e_s \wedge G_L \wedge Q_s \wedge G_R)$

By unfolding clause $\gamma$ w.r.t. the negative literal $\neg A$ we derive the clauses

$\eta_1$: $H \leftarrow c \wedge e_1 \wedge G_L \wedge Q_1 \wedge G_R$
...
$\eta_s$: $H \leftarrow c \wedge e_s \wedge G_L \wedge Q_s \wedge G_R$

and from program $P_k$ we derive the program $P_{k+1} = (P_k - \{\gamma\}) \cup \{\eta_1, \ldots, \eta_s\}$.

Notice that: (i) if $m = 0$, that is, if we unfold clause $\gamma$ w.r.t. a negative literal $\neg A$
such that the constraint $c \wedge A = K_i \wedge c_i$ is satisfiable for no clause $K_i \leftarrow c_i \wedge B_i$ in
$P'_k$, then we get the new program $P_{k+1}$ by deleting $\neg A$ from the body of clause
$\gamma$, and (ii) if we unfold clause $\gamma$ w.r.t. a negative literal $\neg A$ such that for some
clause $K_i \leftarrow c_i \wedge B_i$ in $P'_k$, $\mathcal{D} \models \forall(c \rightarrow \exists V_i\,(A = K_i \wedge c_i))$ and $B_i$ is the empty
conjunction, then we derive the new program $P_{k+1}$ by deleting clause $\gamma$ from $P_k$.
An application of the negative unfolding rule is illustrated by the following 
example. 
Example 3. Suppose that the following clause belongs to program $P_k$:

$\gamma$: $h(X) \leftarrow X > 0 \wedge \neg p(X)$

and let

$p(Y) \leftarrow Y = Z+1 \wedge Z > 0 \wedge q(Z)$
$p(Y) \leftarrow Y = Z-1 \wedge Z > 1 \wedge q(Z) \wedge \neg r(Z)$

be the definition of $p$ in $P_k$. Suppose also that the constraints are interpreted in
the structure $\mathcal{R}$ of the real numbers. Now let us unfold clause $\gamma$ w.r.t. $\neg p(X)$.
We start off from the formula:

$\psi_0$: $X > 0 \wedge \neg(\exists Y \exists Z\,(X = Y \wedge Y = Z+1 \wedge Z > 0 \wedge q(Z)) \,\vee$
        $\exists Y \exists Z\,(X = Y \wedge Y = Z-1 \wedge Z > 1 \wedge q(Z) \wedge \neg r(Z)))$

Then we perform the four steps indicated in rule R4 as follows. 

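Step 1. Since we have that:

$\mathcal{R} \models \forall X \forall Y \forall Z\,(X > 0 \rightarrow ((X = Y \wedge Y = Z+1 \wedge Z > 0) \leftrightarrow (Y = X \wedge Z = X-1 \wedge X > 1)))$

and

$\mathcal{R} \models \forall X \forall Y \forall Z\,(X > 0 \rightarrow ((X = Y \wedge Y = Z-1 \wedge Z > 1) \leftrightarrow (Y = X \wedge Z = X+1)))$

we derive the formula:

$\psi_1$: $X > 0 \wedge \neg((X > 1 \wedge q(X-1)) \vee (q(X+1) \wedge \neg r(X+1)))$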

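Steps 2 and 3. By applying the rewritings indicated in rule R4 we derive the
following formula:

$\psi_3$: $(X > 0 \wedge \neg X > 1 \wedge \neg q(X+1)) \,\vee$
        $(X > 0 \wedge \neg X > 1 \wedge r(X+1)) \,\vee$
        $(X > 0 \wedge X > 1 \wedge \neg q(X-1) \wedge \neg q(X+1)) \,\vee$
        $(X > 0 \wedge X > 1 \wedge \neg q(X-1) \wedge r(X+1))$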

Step 4. Since all constraints in the formula derived at the end of Steps 2 and 3 
are satisfiable, no disjunct is removed. 

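Thus, by unfolding $h(X) \leftarrow X > 0 \wedge \neg p(X)$ w.r.t. $\neg p(X)$ we derive the following
clauses:

$h(X) \leftarrow X > 0 \wedge \neg X > 1 \wedge \neg q(X+1)$

$h(X) \leftarrow X > 0 \wedge \neg X > 1 \wedge r(X+1)$

$h(X) \leftarrow X > 0 \wedge X > 1 \wedge \neg q(X-1) \wedge \neg q(X+1)$

$h(X) \leftarrow X > 0 \wedge X > 1 \wedge \neg q(X-1) \wedge r(X+1)$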
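The equivalence between the original clause for h and the four derived clauses can be checked mechanically on sample points. The following Python sketch is only an illustration, not part of the transformation method: the sample grid `XS`, the function names, and the encoding of q and r as Boolean functions are assumptions made for this check. For each sample value of X it tries every truth assignment to q(X-1), q(X+1), and r(X+1), which are the only atoms the two definitions can inspect.

```python
from itertools import product

# Hypothetical sample points of the real line: -2.0, -1.5, ..., 4.0.
XS = [x / 2 for x in range(-4, 9)]


def p(y, q, r):
    """Definition of p in Example 3; Z is determined by Y in each clause."""
    z1 = y - 1  # first clause:  Y = Z+1, Z > 0, q(Z)
    z2 = y + 1  # second clause: Y = Z-1, Z > 1, q(Z), not r(Z)
    return bool((z1 > 0 and q(z1)) or (z2 > 1 and q(z2) and not r(z2)))


def h_original(x, q, r):
    """h(X) <- X > 0, not p(X)."""
    return x > 0 and not p(x, q, r)


def h_unfolded(x, q, r):
    """Disjunction of the four clauses derived by negative unfolding."""
    return bool(
        (x > 0 and not x > 1 and not q(x + 1))
        or (x > 0 and not x > 1 and r(x + 1))
        or (x > 0 and x > 1 and not q(x - 1) and not q(x + 1))
        or (x > 0 and x > 1 and not q(x - 1) and r(x + 1))
    )


def agree_everywhere():
    """Compare both definitions of h on every sample point and assignment."""
    for x in XS:
        for q_lo, q_hi, r_hi in product([False, True], repeat=3):
            # q_lo is the truth value of q(X-1); q_hi and r_hi those at X+1.
            q = lambda z: q_lo if z == x - 1 else q_hi
            r = lambda z: r_hi
            if h_original(x, q, r) != h_unfolded(x, q, r):
                return False
    return True
```

A call to `agree_everywhere()` compares the two definitions on all sampled cases; it is a sanity check on a finite grid, not a proof over the reals.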

The validity of Conditions (i), (ii), and (iii) in the negative unfolding rule allows
us to eliminate the existential quantifiers as indicated at Step 1. If these conditions
do not hold and nonetheless we eliminate the existential quantifiers, then
negative unfolding may be incorrect, as illustrated by the following example.

Example 4. Let us consider the following programs $P_0$ and $P_1$, where $P_1$ is
obtained by negative unfolding from $P_0$, but Conditions (i)-(iii) do not hold:

$P_0$: $p \leftarrow \neg q$               $P_1$: $p \leftarrow \neg r(X)$
      $q \leftarrow r(X)$                   $q \leftarrow r(X)$
      $r(X) \leftarrow X = 0$               $r(X) \leftarrow X = 0$

We have that: $p \notin M(P_0)$ while $p \in M(P_1)$. (We assume that the carrier of the
interpretation for the constraints contains at least one element different from 0.)
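The difference between the two programs of Example 4 comes down to where the quantifier on X sits. The following Python sketch makes this concrete on a two-element carrier; the carrier `CARRIER` and the variable names are assumptions chosen for illustration only.

```python
# Hypothetical carrier containing one element different from 0.
CARRIER = [0, 1]


def r(x):
    """r(X) <- X = 0."""
    return x == 0


# In both programs, q <- r(X) means: q holds iff r(X) holds for SOME X.
q = any(r(x) for x in CARRIER)

# P0: p <- not q, that is, p holds iff there is NO X such that r(X).
p_in_P0 = not q

# P1: p <- not r(X), that is, p holds iff r(X) FAILS for some X.
p_in_P1 = any(not r(x) for x in CARRIER)
```

Here `p_in_P0` is false because r(0) holds, whereas `p_in_P1` is true because r(1) fails, in agreement with the membership claims of the example.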

The reason why the negative unfolding step of Example 4 is incorrect is that
the clause $q \leftarrow r(X)$ is, as usual, implicitly universally quantified at the front,
and $\forall X\,(q \leftarrow r(X))$ is logically equivalent to $q \leftarrow \exists X\, r(X)$. Now, a correct
negative unfolding rule should replace the clause $p \leftarrow \neg q$ in program $P_0$ by
$p \leftarrow \neg\exists X\, r(X)$, while in program $P_1$ we have derived the clause $p \leftarrow \neg r(X)$
which, by making the quantification explicit at the front of the body, can be
written as $p \leftarrow \exists X\, \neg r(X)$.

The folding rule consists in replacing instances of the bodies of the clauses 
that are the definition of a predicate by the corresponding head. As for unfolding, 
we have a positive folding and a negative folding rule, depending on whether 
folding is applied to positive or negative occurrences of (conjunctions of) literals. 
Notice that by the positive folding rule we may replace $m$ ($\geq 1$) clauses by one
clause only.

R5. Positive Folding. Let $\gamma_1, \ldots, \gamma_m$, with $m \geq 1$, be clauses in $P_k$ and let
$Defs'_k$ be a variant of $Defs_k$ without common variables with $\gamma_1, \ldots, \gamma_m$. Let the
definition of a predicate in $Defs'_k$ consist of the clauses

$\delta_1$: $K \leftarrow d_1 \wedge B_1$
...
$\delta_m$: $K \leftarrow d_m \wedge B_m$

where, for $i = 1, \ldots, m$, $B_i$ is a non-empty conjunction of literals. Suppose
that there exists a substitution $\vartheta$ such that, for $i = 1, \ldots, m$, clause $\gamma_i$ is of
the form $H \leftarrow c \wedge d_i\vartheta \wedge G_L \wedge B_i\vartheta \wedge G_R$ and, for every variable $X$ in the set
$FV(d_i \wedge B_i) - FV(K)$, the following conditions hold: (i) $X\vartheta$ is a variable not
occurring in $\{H, c, G_L, G_R\}$, and (ii) $X\vartheta$ does not occur in the term $Y\vartheta$, for any
variable $Y$ occurring in $d_i \wedge B_i$ and different from $X$.

By folding clauses $\gamma_1, \ldots, \gamma_m$ using clauses $\delta_1, \ldots, \delta_m$ we derive the clause $\eta$:
$H \leftarrow c \wedge G_L \wedge K\vartheta \wedge G_R$ and from program $P_k$ we derive the program
$P_{k+1} = (P_k - \{\gamma_1, \ldots, \gamma_m\}) \cup \{\eta\}$.

The following example illustrates an application of rule R5. 

Example 5. Suppose that the following clauses belong to $P_k$:

$\gamma_1$: $h(X) \leftarrow X > 1 \wedge Y = X-1 \wedge p(Y,1)$

$\gamma_2$: $h(X) \leftarrow X > 1 \wedge Y = X+1 \wedge \neg q(Y)$

and suppose that the following clauses constitute the definition of a predicate
$new$ in $Defs_k$:

$\delta_1$: $new(Z,C) \leftarrow V = Z-C \wedge p(V,C)$

$\delta_2$: $new(Z,C) \leftarrow V = Z+C \wedge \neg q(V)$

For $\vartheta = \{V/Y, Z/X, C/1\}$, we have that $\gamma_1 = h(X) \leftarrow X > 1 \wedge (V = Z-C \wedge p(V,C))\vartheta$
and $\gamma_2 = h(X) \leftarrow X > 1 \wedge (V = Z+C \wedge \neg q(V))\vartheta$, and the substitution




304 



Fabio Fioravanti, Alberto Pettorossi, and Maurizio Proietti 



$\vartheta$ satisfies Conditions (i) and (ii) of the positive folding rule. By folding clauses
$\gamma_1$ and $\gamma_2$ using clauses $\delta_1$ and $\delta_2$ we derive:

$\eta$: $h(X) \leftarrow X > 1 \wedge new(X,1)$

R6. Negative Folding. Let $\gamma$ be a clause in $P_k$ and let $Defs'_k$ be a variant of
$Defs_k$ without common variables with $\gamma$. Suppose that there exists a predicate
in $Defs'_k$ whose definition consists of a single clause $\delta$: $K \leftarrow d \wedge A$, where $A$ is
an atom. Suppose also that there exists a substitution $\vartheta$ such that clause $\gamma$ is of
the form: $H \leftarrow c \wedge d\vartheta \wedge G_L \wedge \neg A\vartheta \wedge G_R$ and $FV(K) = FV(d \wedge A)$.

By folding clause $\gamma$ using clause $\delta$ we derive the clause $\eta$: $H \leftarrow c \wedge d\vartheta \wedge G_L \wedge \neg K\vartheta \wedge G_R$
and from program $P_k$ we derive the program $P_{k+1} = (P_k - \{\gamma\}) \cup \{\eta\}$.

The following is an example of application of the negative folding rule. 

Example 6. Let the following clause belong to $P_k$:

$\gamma$: $h(X) \leftarrow X > 0 \wedge q(X) \wedge \neg r(X,0)$

and let $new$ be a predicate whose definition in $Defs_k$ consists of the clause:

$\delta$: $new(X,C) \leftarrow X > C \wedge r(X,C)$

By folding $\gamma$ using $\delta$ we derive:

$\eta$: $h(X) \leftarrow X > 0 \wedge q(X) \wedge \neg new(X,0)$

The positive and negative folding rules are not fully symmetric for the following
three reasons.

(1) By positive folding we can fold several clauses at a time by using several
clauses whose body may contain several literals, while by negative folding we
can fold a single clause at a time by using a single clause whose body contains
precisely one atom. This is motivated by the fact that a conjunction of more
than one literal cannot occur inside negation in the body of a clause.

(2) By positive folding, for $i = 1, \ldots, m$, the constraint $d_i\vartheta$ occurring in the body
of clause $\gamma_i$ is removed, while by negative folding the constraint $d\vartheta$ occurring in
the body of clause $\gamma$ is not removed. Indeed, the removal of the constraint $d\vartheta$
would be incorrect. For instance, let us consider the program $P_k$ of Example 6
above and let us assume that $\gamma$ is the only clause defining the predicate $h$. Let
us also assume that the predicates $q$ and $r$ are defined by the following two
clauses: $q(X) \leftarrow X < 0$ and $r(X,0) \leftarrow X < 0$. We have that $h(-1) \notin M(P_k)$.
Suppose that we apply the negative folding rule to clause $\gamma$ and we remove the
constraint $X > 0$, thereby deriving the clause $h(X) \leftarrow q(X) \wedge \neg new(X,0)$, instead
of clause $\eta$. Then we obtain a program whose perfect model has the atom $h(-1)$.

(3) The conditions on the variables occurring in the clauses used for folding are
less restrictive in the case of positive folding (see Conditions (i) and (ii) of R5)
than in the case of negative folding (see the condition $FV(K) = FV(d \wedge A)$).
Notice that a negative folding rule where the condition $FV(K) = FV(d \wedge A)$ is
replaced by Conditions (i) and (ii) of R5 would be incorrect, in general. To see
this, let us consider the following example which may be viewed as the inverse
derivation of Example 4.
Example 7. Let us consider the following programs $P_0$, $P_1$, and $P_2$, where $P_1$
is obtained from $P_0$ by definition introduction, and $P_2$ is obtained from $P_1$ by
incorrectly folding $p \leftarrow \neg r(X)$ using $q \leftarrow r(Y)$. Notice that $FV(q) \neq FV(r(X))$
but Conditions (i) and (ii) are satisfied by the substitution $\{Y/X\}$.

$P_0$: $p \leftarrow \neg r(X)$        $P_1$: $p \leftarrow \neg r(X)$        $P_2$: $p \leftarrow \neg q$
      $r(X) \leftarrow X = 0$              $r(X) \leftarrow X = 0$              $r(X) \leftarrow X = 0$
                                   $q \leftarrow r(Y)$                  $q \leftarrow r(Y)$

We have that: $p \in M(P_0)$ while $p \notin M(P_2)$. (We assume that the carrier of the
interpretation for the constraints contains at least one element different from 0.)

If we consider the folding and unfolding rules outside the context of a transformation
sequence, either rule can be viewed as the inverse of the other. However,
given a transformation sequence $P_0, \ldots, P_n$, it may be the case that from a program
$P_k$ in that sequence we derive program $P_{k+1}$ by folding, and from program
$P_{k+1}$ we cannot derive by unfolding a program $P_{k+2}$ which is equal to $P_k$. This is
due to the fact that in the transformation sequence $P_0, \ldots, P_k, P_{k+1}$, in order to
fold some clauses in program $P_k$, we may use clauses in $Defs_k$ which are neither
in $P_k$ nor in $P_{k+1}$, while for unfolding program $P_{k+1}$ we can only use clauses
which belong to $P_{k+1}$. Thus, according to the terminology introduced in [29], we
say that folding is, in general, not reversible. This fact is shown by the following
example.

Example 8. Let us consider the transformation sequence:

$P_0$: $p \leftarrow q$      $P_1$: $p \leftarrow q$      $P_2$: $p \leftarrow q$      $P_3$: $p \leftarrow r$
      $q \leftarrow$               $q \leftarrow$               $q \leftarrow$               $q \leftarrow$
                        $r \leftarrow q$             $r \leftarrow$               $r \leftarrow$

where $P_1$ is derived from $P_0$ by introducing the definition $r \leftarrow q$, $P_2$ is derived
from $P_1$ by unfolding the clause $r \leftarrow q$, and $P_3$ is derived from $P_2$ by folding
the clause $p \leftarrow q$ using the definition $r \leftarrow q$. We have that from program $P_3$ we
cannot derive a program equal to $P_2$ by applying the positive unfolding rule.

Similarly, the unfolding rules are not reversible in general. In fact, if we derive a
program $P_{k+1}$ by unfolding a clause in a program $P_k$ and we have that $Defs_k = \emptyset$,
then we cannot apply the folding rule and derive a program $P_{k+2}$ which is equal
to $P_k$, simply because no clause in $Defs_k$ is available for folding.

The following replacement rule can be applied to replace a set of clauses with
a new set of clauses by using laws based on equivalences between formulas. In
particular, we consider: (i) boolean laws, (ii) equivalences that can be proved
in the chosen interpretation $\mathcal{D}$ for the constraints, and (iii) properties of the
equality predicate.

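R7. Replacement Based on Laws. Let us consider the following rewritings
$\Gamma_1 \Rightarrow \Gamma_2$ between sets of clauses (we use $\Gamma_1 \Leftrightarrow \Gamma_2$ as a shorthand for the two
rewritings $\Gamma_1 \Rightarrow \Gamma_2$ and $\Gamma_2 \Rightarrow \Gamma_1$). Each rewriting is called a law.

Boolean Laws

(1) $\{H \leftarrow c \wedge A \wedge \neg A \wedge G\} \Leftrightarrow \emptyset$

(2) $\{H \leftarrow c \wedge H \wedge G\} \Leftrightarrow \emptyset$

(3) $\{H \leftarrow c \wedge G_1 \wedge A_1 \wedge A_2 \wedge G_2\} \Leftrightarrow \{H \leftarrow c \wedge G_1 \wedge A_2 \wedge A_1 \wedge G_2\}$

(4) $\{H \leftarrow c \wedge A \wedge A \wedge G\} \Rightarrow \{H \leftarrow c \wedge A \wedge G\}$

(5) $\{H \leftarrow c \wedge G_1,\ H \leftarrow c \wedge d \wedge G_1 \wedge G_2\} \Leftrightarrow \{H \leftarrow c \wedge G_1\}$

(6) $\{H \leftarrow c \wedge A \wedge G,\ H \leftarrow c \wedge \neg A \wedge G\} \Rightarrow \{H \leftarrow c \wedge G\}$

Laws of Constraints

(7) $\{H \leftarrow c \wedge G\} \Leftrightarrow \emptyset$
    if the constraint $c$ is unsatisfiable, that is, $\mathcal{D} \models \neg\exists(c)$

(8) $\{H \leftarrow c_1 \wedge G\} \Leftrightarrow \{H \leftarrow c_2 \wedge G\}$
    if $\mathcal{D} \models \forall(\exists Y\, c_1 \leftrightarrow \exists Z\, c_2)$, where:
    (i) $Y = FV(c_1) - FV(\{H, G\})$, and
    (ii) $Z = FV(c_2) - FV(\{H, G\})$

(9) $\{H \leftarrow c \wedge G\} \Leftrightarrow \{H \leftarrow c_1 \wedge G,\ H \leftarrow c_2 \wedge G\}$
    if $\mathcal{D} \models \forall(c \leftrightarrow (c_1 \vee c_2))$

Laws of Equality

(10) $\{(H \leftarrow c \wedge G)\{X/t\}\} \Leftrightarrow \{H \leftarrow X = t \wedge c \wedge G\}$
     if the variable $X$ does not occur in the term $t$
     and $t$ is free for $X$ in $c$.

Let $\Gamma_1$ and $\Gamma_2$ be sets of clauses such that: (i) $\Gamma_1 \Rightarrow \Gamma_2$, and (ii) $\Gamma_2$ is locally
stratified w.r.t. the fixed local stratification function $\sigma$. By replacement from $\Gamma_1$
we derive $\Gamma_2$ and from program $P_k$ we derive the program $P_{k+1} = (P_k - \Gamma_1) \cup \Gamma_2$.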

Condition (ii) on $\Gamma_2$ is needed because a replacement based on laws (1), (2),
(5), and (7), used from right to left, may not preserve local stratification. For
instance, the first law may be used to introduce a clause of the form $p \leftarrow p \wedge \neg p$,
which is not locally stratified. We will see at the end of Section 4 that if we add
the reverse versions of the boolean laws (4) or (6), then the correctness result
stated in Theorem 3 does not hold.

The following definition is needed for stating rule R8 below. The set of useless
predicates in a program $P$ is the maximal set $U$ of predicate symbols occurring
in $P$ such that a predicate $p$ is in $U$ iff every clause $\gamma$ in $Def(p, P)$ is of the
form $H \leftarrow c \wedge G_1 \wedge q(\ldots) \wedge G_2$ for some $q$ in $U$. For example, in the following
program:

$p(X) \leftarrow q(X) \wedge \neg r(X)$
$q(X) \leftarrow p(X)$
$r(X) \leftarrow X > 0$

p and q are useless predicates, while r is not useless.
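Since the set of useless predicates is defined as a maximal set, it can be computed as a greatest fixpoint: start from all predicates and repeatedly discard those having a clause with no positive body atom whose predicate is still in the set. The Python sketch below illustrates this reading; the function name `useless_predicates` and the program encoding (each clause reduced to the list of predicates of its positive body atoms) are assumptions made for this example.

```python
def useless_predicates(program):
    """Return the maximal set U of useless predicates.

    `program` maps each predicate name to the list of its clauses, and each
    clause is given as the list of predicate names of the POSITIVE atoms in
    its body (constraints and negative literals are ignored).
    """
    useless = set(program)  # start from all predicates and shrink
    changed = True
    while changed:
        changed = False
        for pred in sorted(useless):
            # pred stays in U only if EVERY clause defining it has some
            # positive body atom whose predicate is still in U.
            every_clause_blocked = all(
                any(q in useless for q in body) for body in program[pred]
            )
            if not every_clause_blocked:
                useless.discard(pred)
                changed = True
    return useless


# The program of the example above:
#   p(X) <- q(X), not r(X)      q(X) <- p(X)      r(X) <- X > 0
example = {
    "p": [["q"]],  # the negative literal not r(X) is not listed
    "q": [["p"]],
    "r": [[]],     # the body of r contains a constraint only
}
```

On `example` the function returns the set containing p and q. Under the definition as stated, a predicate with an empty list of clauses satisfies the condition vacuously and therefore stays in the result.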
R8. Deletion of Useless Predicates. If $p$ is a useless predicate in $P_k$, then
from program $P_k$ we derive the program $P_{k+1} = P_k - Def(p, P_k)$.

Neither of the rules R2 and R8 subsumes the other. Indeed, on one hand 
the definition of a predicate p on which no predicate of interest depends, can be 
deleted by rule R2 even if p is not useless. On the other hand, the definition of 
a useless predicate p can be deleted by rule R8 even if a predicate of interest 
depends on p. 

The constraint addition rule R9, which we present below, can be applied
to add to the body of a clause a constraint which is implied by that body.
Conversely, the constraint deletion rule R10, also presented below, can be applied
to delete from the body of a clause a constraint which is implied by the rest of
the body. Notice that these implications should hold in the perfect model of
program $P_k$, while the applicability conditions of rule R7 (see, in particular, the
replacements based on laws 7-9) are independent of $P_k$. Thus, for checking the
applicability conditions of rules R9 and R10 we may need a program analysis
based, for instance, on abstract interpretation [10].

R9. Constraint Addition. Let $\gamma_1$: $H \leftarrow c \wedge G$ be a clause in $P_k$ and let $d$ be a
constraint such that $M(P_k) \models \forall((c \wedge G) \rightarrow \exists X\, d)$, where $X = FV(d) - FV(\gamma_1)$.
By constraint addition from clause $\gamma_1$ we derive the clause $\gamma_2$: $H \leftarrow c \wedge d \wedge G$
and from program $P_k$ we derive the program $P_{k+1} = (P_k - \{\gamma_1\}) \cup \{\gamma_2\}$.

The following example shows an application of the constraint addition rule 
that cannot be realized by applying laws of constraints according to rule R7. 

Example 9. Let us consider the following program $P_k$:

1. $nat(0) \leftarrow$

2. $nat(N) \leftarrow N = M+1 \wedge nat(M)$

Since $M(P_k) \models \forall M\,(nat(M) \rightarrow M \geq 0)$, we can add the constraint $M \geq 0$ to
the body of clause 2. This constraint addition improves the termination of the
program when using a top-down strategy.

R10. Constraint Deletion. Let $\gamma_1$: $H \leftarrow c \wedge d \wedge G$ be a clause in $P_k$ and
let $d$ be a constraint such that $M(P_k) \models \forall((c \wedge G) \rightarrow \exists X\, d)$, where $X =
FV(d) - FV(H \leftarrow c \wedge G)$. Suppose that the clause $\gamma_2$: $H \leftarrow c \wedge G$ is locally
stratified w.r.t. the fixed $\sigma$. By constraint deletion from clause $\gamma_1$ we derive clause
$\gamma_2$ and from program $P_k$ we derive the program $P_{k+1} = (P_k - \{\gamma_1\}) \cup \{\gamma_2\}$.

We assume that $\gamma_2$ is locally stratified w.r.t. $\sigma$ because otherwise the constraint
deletion rule may not preserve local stratification. For instance, let us
consider the following program $P$:

$p(X) \leftarrow$

$p(X) \leftarrow X \neq X \wedge \neg p(X)$

$P$ is locally stratified because for all elements $d$ in the carrier of the interpretation
$\mathcal{D}$ for the constraints, we have that $\mathcal{D} \models d = d$. We also have that $M(P) \models
\forall X\,(\neg p(X) \rightarrow X \neq X)$. However, if we delete the constraint $X \neq X$ from the
second clause of $P$ we derive the clause $p(X) \leftarrow \neg p(X)$ which is not locally
stratified w.r.t. any local stratification function.
4 Preservation of Perfect Models

In this section we present some sufficient conditions which ensure that a trans- 
formation sequence constructed by applying the transformation rules listed in 
Section 3, preserves the perfect model semantics. 

We will prove our correctness theorem for admissible transformation se- 
quences, that is, transformation sequences constructed by applying the rules 
according to suitable restrictions. The reader who is familiar with the program 
transformation methodology, will realize that most transformation strategies 
can, indeed, be realized by means of admissible transformation sequences. In 
particular, all examples of Section 5 are worked out by using this kind of trans- 
formation sequences. 

We proceed as follows. (i) First we show that the transformation rules preserve
local stratification. (ii) Then we introduce the notion of an admissible
transformation sequence. (iii) Next we introduce the notion of a proof tree for a
ground atom $A$ and a program $P$ and we show that $A \in M(P)$ iff there exists a
proof tree for $A$ and $P$. Thus, the notion of proof tree provides the operational
counterpart of the perfect model semantics. (iv) Then, we prove that given any
admissible transformation sequence $P_0, \ldots, P_n$, any set $Pred_{int}$ of predicates of
interest, and any ground atom $A$ whose predicate is in $Pred_{int}$, we have that for
$k = 0, \ldots, n$, there exists a proof tree for $A$ and $P_k$ iff there exists a proof tree
for $A$ and $P_0 \cup Defs_n$. (v) Finally, by using the property of proof trees considered
at Point (iii), we conclude that an admissible transformation sequence preserves
the perfect model semantics (see Theorem 3).

Let us start off by showing that the transformation rules preserve the local
stratification function $\sigma$ which was fixed for the initial program $P_0$ at the
beginning of the construction of the transformation sequence.

Proposition 1. [Preservation of Local Stratification]. Let $P_0$ be a locally stratified
program, let $\sigma : B_{\mathcal{D}} \rightarrow W$ be a local stratification function for $P_0$, and let
$P_0, \ldots, P_n$ be a transformation sequence using $\sigma$. Then the programs $P_0, \ldots, P_n$,
and $P_0 \cup Defs_n$ are locally stratified w.r.t. $\sigma$.

The proof of Proposition 1 is given in Appendix A. 

An admissible transformation sequence is a transformation sequence that 
satisfies two conditions: (1) every clause used for positive folding is unfolded 
w.r.t. a positive literal, and (2) the definition elimination rule cannot be applied 
before any other transformation rule. An admissible transformation sequence is 
formally defined as follows. 

Definition 2. [Admissible Transformation Sequence] A transformation sequence
$P_0, \ldots, P_n$ is said to be admissible iff the following two conditions hold:

(1) for $k = 0, \ldots, n-1$, if $P_{k+1}$ is derived from $P_k$ by applying the positive folding
rule to clauses $\gamma_1, \ldots, \gamma_m$ using clauses $\delta_1, \ldots, \delta_m$, then for $i = 1, \ldots, m$ there
exists $j$, with $0 < j < n$, such that $\delta_i \in P_j$ and program $P_{j+1}$ is derived from $P_j$
by positive unfolding of clause $\delta_i$, and

(2) if for some $m < n$, $P_{m+1}$ is derived from $P_m$ by the definition elimination rule
then for all $k = m, \ldots, n-1$, $P_{k+1}$ is derived from $P_k$ by applying the definition
elimination rule.

When proving our correctness theorem (see Theorem 3 below), we will find it 
convenient to consider transformation sequences which are admissible and satisfy 
some extra suitable properties. This motivates the following notion of ordered 
transformation sequences. 

Definition 3. [Ordered Transformation Sequence] A transformation sequence
$P_0, \ldots, P_n$ is said to be ordered iff it is of the form:

$P_0, \ldots, P_i, \ldots, P_j, \ldots, P_m, \ldots, P_n$

where:

(1) the sequence $P_0, \ldots, P_i$, with $i > 0$, is constructed by applying $i$ times the
definition introduction rule, that is, $P_i = P_0 \cup Defs_i$;

(2) the sequence $P_i, \ldots, P_j$ is constructed by unfolding w.r.t. a positive literal each
clause in $Defs_i$ which is used for applications of the folding rule in $P_j, \ldots, P_m$;

(3) the sequence $P_j, \ldots, P_m$, with $j < m$, is constructed by applying any rule,
except the definition introduction and definition elimination rules; and

(4) the sequence $P_m, \ldots, P_n$, with $m < n$, is constructed by applying the definition
elimination rule.

Notice that in an ordered transformation sequence we have that $Defs_i = Defs_n$.
Every ordered transformation sequence is admissible, because of Points (2) and
(4) of Definition 3. Conversely, by the following Proposition 2, in our correctness
proofs we will assume, without loss of generality, that any admissible transformation
sequence is ordered.

Proposition 2. For every admissible transformation sequence $P_0, \ldots, P_n$, there
exists an ordered transformation sequence $Q_0, \ldots, Q_r$ (with $r$ possibly different
from $n$), such that: (i) $P_0 = Q_0$, (ii) $P_n = Q_r$, and (iii) the set of definitions
introduced during $P_0, \ldots, P_n$ is equal to the set of definitions introduced during
$Q_0, \ldots, Q_r$.

The easy proof of Proposition 2 is omitted for reasons of space. It is based 
on the fact that the applications of some transformation rules can be suitably 
rearranged without changing the initial and final programs in a transformation 
sequence. 

Now we present the operational counterpart of the perfect model semantics, 
that is, the notion of a proof tree. A proof tree for a ground atom A and a locally 
stratified program P is constructed by transfinite induction as indicated in the 
following definition. 

Definition 4. [Proof Tree] Let $A$ be a ground atom, $P$ be a locally stratified
program, and $\sigma$ be any local stratification for $P$. Let $PT_{<A}$ be the set of proof
trees for ground atoms $B$ and $P$ with $\sigma(B) < \sigma(A)$. A proof tree for $A$ and $P$ is
a finite tree $T$ of goals such that: (i) the root of $T$ is $A$, (ii) a node $N$ of $T$ has
children $L_1, \ldots, L_r$ iff $N$ is a ground atom $B$ and there exists a clause $\gamma \in P$ and
a valuation $v$ such that $v(\gamma)$ is $B \leftarrow c \wedge L_1 \wedge \ldots \wedge L_r$ and $\mathcal{D} \models c$, and (iii) every
leaf of $T$ is either the empty conjunction true or a negated ground atom $\neg B$ such
that there is no proof tree for $B$ and $P$ in $PT_{<A}$.

The following theorem establishes that the operational semantics based on proof 
trees is equivalent to the perfect model semantics. 

Theorem 2. [Proof Trees and Perfect Models] Let $P$ be a locally stratified
program. For all ground atoms $A$, there exists a proof tree for $A$ and $P$ iff
$A \in M(P)$.

Our proofs of correctness use induction w.r.t. suitable well-founded measures 
over proof trees, ground atoms, and ground goals (see, in particular, the proofs of 
Propositions 3 and 5 in Appendices B and C). We now introduce these measures. 

Let $T$ be a proof tree for a ground atom $A$ and a locally stratified program
$P$. By $size(T)$ we denote the number of atoms occurring at non-leaf nodes of $T$.
For any ground atom $A$, locally stratified program $P$, and local stratification $\sigma$
for $P$, we define the following measure:

$\mu(A, P) = min_{lex}\{(\sigma(A), size(T)) \mid T \text{ is a proof tree for } A \text{ and } P\}$

where $min_{lex}$ denotes the minimum w.r.t. the lexicographic ordering $<_{lex}$ over
$W \times N$, where $W$ is the set of countable ordinals and $N$ is the set of natural
numbers. $\mu(A, P)$ is undefined if there is no proof tree for $A$ and $P$. The measure
$\mu$ is extended from ground atoms to ground literals as follows. Given a ground
literal $L$, we define:

$\mu(L, P)$ = if $L$ is an atom $A$ then $\mu(A, P)$
            else if $L$ is a negated atom $\neg A$ then $(\sigma(A), 0)$

Now we extend $\mu$ to ground goals. First, we introduce the binary, associative
operation $\oplus : (W \times N)^2 \rightarrow (W \times N)$ defined as follows:

$(\alpha_1, m_1) \oplus (\alpha_2, m_2) = (max(\alpha_1, \alpha_2),\ m_1 + m_2)$

Then, given a ground goal $L_1 \wedge \ldots \wedge L_n$, we define:

$\mu(L_1 \wedge \ldots \wedge L_n, P) = \mu(L_1, P) \oplus \ldots \oplus \mu(L_n, P)$

The measure $\mu$ is well-founded in the sense that there is no infinite sequence of
ground goals $G_1, G_2, \ldots$ such that $\mu(G_1, P) > \mu(G_2, P) > \ldots$
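The arithmetic of the measure on goals can be illustrated with a small Python sketch. It is a simplification under a stated assumption: strata are represented by natural numbers (finite ordinals) rather than arbitrary countable ordinals, so that Python's built-in lexicographic comparison of tuples can play the role of the lexicographic ordering. The names `oplus` and `goal_measure` are chosen here for illustration.

```python
from functools import reduce


def oplus(m1, m2):
    """(a1, n1) (+) (a2, n2) = (max(a1, a2), n1 + n2)."""
    (a1, n1), (a2, n2) = m1, m2
    return (max(a1, a2), n1 + n2)


def goal_measure(literal_measures):
    """Measure of a non-empty ground goal, given the measures of its literals.

    Each literal measure is a pair (stratum, size); a negated atom of
    stratum s contributes the pair (s, 0).
    """
    return reduce(oplus, literal_measures)


# A goal made of an atom with measure (2, 3) and a negated atom of stratum 1.
body = goal_measure([(2, 3), (1, 0)])

# Python compares tuples lexicographically, first by stratum, then by size.
head = (2, 4)
```

With these values `body` is `(2, 3)` and `head > body` holds, which is the kind of strict decrease between a father node and its children that the consistency notion introduced later in this section requires.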

In order to show that an ordered transformation sequence $P_0, \ldots, P_i, \ldots,
P_j, \ldots, P_m, \ldots, P_n$ (where the meaning of the subscripts is the one of Definition 3)
preserves the perfect model semantics, we will use Theorem 2 and we
will show that, for $k = 0, \ldots, n$, given any ground atom $A$ whose predicate belongs
to the set $Pred_{int}$ of predicates of interest, there exists a proof tree for $A$
and $P_k$ iff there exists a proof tree for $A$ and $P_0 \cup Defs_n$. Since $P_i = P_0 \cup Defs_n$,
it is sufficient to show the following properties, for any ground atom $A$:

(P1) there exists a proof tree for $A$ and $P_i$ iff there exists a proof tree for $A$ and
$P_j$,

(P2) there exists a proof tree for $A$ and $P_j$ iff there exists a proof tree for $A$ and
$P_m$, and

(P3) if the predicate of $A$ is in $Pred_{int}$, then there exists a proof tree for $A$ and
$P_m$ iff there exists a proof tree for $A$ and $P_n$.

Property P1 is proved by the following proposition.

Proposition 3. Let $P_0$ be a locally stratified program and let $P_0, \ldots, P_i, \ldots,
P_j, \ldots, P_m, \ldots, P_n$ be an ordered transformation sequence. Then there exists a
proof tree for a ground atom $A$ and $P_i$ iff there exists a proof tree for $A$ and $P_j$.

The proof of Proposition 3 is given in Appendix B. It is a proof by induction on
$\sigma(A)$ and on the size of the proof tree for $A$.

In order to prove the only-if part of Property P2, we will show a stronger 
invariant property based on the following consistency notion. 

Definition 5. [$P_j$-consistency] Let $P_0, \ldots, P_i, \ldots, P_j, \ldots, P_m, \ldots, P_n$ be an ordered
transformation sequence, $P_k$ be a program in this sequence, and $A$ be a
ground atom. We say that a proof tree $T$ for $A$ and $P_k$ is $P_j$-consistent iff
for every ground atom $B$ and ground literals $L_1, \ldots, L_r$, if $B$ is the father of
$L_1, \ldots, L_r$ in $T$, then $\mu(B, P_j) > \mu(L_1 \wedge \ldots \wedge L_r, P_j)$.

The invariant property is as follows: for every program $P_k$ in the sequence
$P_j, \ldots, P_m$, if there exists a $P_j$-consistent proof tree for $A$ and $P_j$, then there
exists a $P_j$-consistent proof tree for $A$ and $P_k$.

It is important that $P_j$-consistency refers to the program $P_j$ obtained by
applying the positive unfolding rule to each clause that belongs to $Defs_i$ and
is used in $P_j, \ldots, P_m$ for a folding step. Indeed, if the positive unfolding rule is
not applied to a clause in $Defs_i$, and this clause is then used (possibly, together
with other clauses) in a folding step, then the preservation of $P_j$-consistent proof
trees may not be ensured and the transformation sequence may not be correct.
This is shown by Example 1 of the Introduction where we assume that the first
clause $p(X) \leftarrow \neg q(X)$ of $P_0$ has been added by the definition introduction rule
in a previous step.

We have the following. 

Proposition 4. If there exists a proof tree for a ground atom $A$ and program
$P_j$ then there exists a $P_j$-consistent proof tree for $A$ and $P_j$.

Proof. Let $T$ be a proof tree for $A$ and $P_j$ such that $(\sigma(A), size(T))$ is minimal
w.r.t. $<_{lex}$. Then $T$ is $P_j$-consistent. □

Notice that in the proof of Proposition 4 we state the existence of a $P_j$-consistent
proof tree for a ground atom $A$ and program $P_j$ without providing an effective
method for constructing this proof tree. In fact, it should be noticed that no
effective method can be given for constructing the minimal proof tree for a given
atom and program, because the existence of such a proof tree is not decidable
and not even semi-decidable.

By Proposition 4, in order to prove Property P2 it is enough to show the 
following Proposition 5. 
Proposition 5. Let $P_0$ be a locally stratified program and let $P_0, \ldots, P_i, \ldots,
P_j, \ldots, P_m, \ldots, P_n$ be an ordered transformation sequence. Then, for every
ground atom $A$ we have that:

(Soundness) if there exists a proof tree for $A$ and $P_m$, then there exists a proof
tree for $A$ and $P_j$, and

(Completeness) if there exists a $P_j$-consistent proof tree for $A$ and $P_j$, then there
exists a $P_j$-consistent proof tree for $A$ and $P_m$.

The proof of Proposition 5 is given in Appendix C. 

In order to prove Property P3, it is enough to prove the following Propo- 
sition 6, which is a straightforward consequence of the fact that the existence 
of a proof tree for a ground atom with predicate p is determined only by the 
existence of proof trees for atoms with predicates on which p depends. 

Proposition 6. Let P be a locally stratified program and let Predint be a set 
of predicates of interest. Suppose that program Q is derived from program P 
by eliminating the definition of a predicate q such that no predicate in Predint 
depends on q. Then, for every ground atom A whose predicate is in Predint, 
there exists a proof tree for A and P iff there exists a proof tree for A and Q. 

Now, as a consequence of Propositions 1-6, and Theorem 2, we get the following 
theorem which ensures that an admissible transformation sequence preserves the 
perfect model semantics. 

Theorem 3. [Correctness of Admissible Transformation Sequences] Let $P_0$ be
a locally stratified program and let $P_0, \ldots, P_n$ be an admissible transformation
sequence. Let $Pred_{int}$ be the set of predicates of interest. Then $P_0 \cup Defs_n$ and
$P_n$ are locally stratified and for every ground atom $A$ whose predicate belongs to
$Pred_{int}$, $A \in M(P_0 \cup Defs_n)$ iff $A \in M(P_n)$.

This theorem does not hold if we add to the boolean laws listed in rule R7 of 
Section 3 the inverse of law (4), as shown by the following example. 

Example 10. Let us consider the following transformation sequence:

$P_0$: $p \leftarrow q \wedge q$      $P_1$: $p \leftarrow q$      $P_2$: $p \leftarrow q \wedge q$      $P_3$: $p \leftarrow p$
      $q \leftarrow$                     $q \leftarrow$               $q \leftarrow$                     $q \leftarrow$

We assume that the clause for $p$ in $P_0$ is added to $P_0$ by the definition introduction
rule, so that it can be used for folding. Program $P_1$ is derived from $P_0$ by
unfolding, program $P_2$ is derived from $P_1$ by replacement based on the reverse
of law (4), and finally, program $P_3$ is derived by folding the first clause of $P_2$
using the first clause of $P_0$. We have that $p \in M(P_0)$, while $p \notin M(P_3)$.
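The membership claims of Example 10 can be checked with a small bottom-up evaluator. The programs of the example are propositional and contain no negation, so their perfect model coincides with the least model, which the Python sketch below computes by iterating to a fixpoint. The function name `least_model` and the clause encoding are assumptions made for this illustration.

```python
def least_model(program):
    """Least model of a propositional definite program.

    `program` is a list of clauses (head, body), where body is the list of
    atoms that must all hold for the head to be derived.
    """
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in program:
            if head not in model and all(atom in model for atom in body):
                model.add(head)
                changed = True
    return model


# Programs P0 and P3 of Example 10.
P0 = [("p", ["q", "q"]), ("q", [])]
P3 = [("p", ["p"]), ("q", [])]
```

Evaluating `least_model(P0)` yields the set containing p and q, while `least_model(P3)` contains q only: the clause p <- p can never fire, so p is lost after the folding step.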

Analogously, the reader may verify that Theorem 3 does not hold if we add to 
the boolean laws of rule R7 the inverse of law (6). 




Transformation Rules for Locally Stratified Constraint Logic Programs 313 



5 Examples of Use of the Transformation Rules 

In this section we show some program derivations realized by applying the trans- 
formation rules of Section 3. These program derivations are examples of the 
following three techniques: (1) the determinization technique, which is used for 
deriving a deterministic program from a nondeterministic one [14,33], (2) the 
program synthesis technique, which is used for deriving a program from a first 
order logic specification (see, for instance, [18,41] and [6] in this book for a recent 
survey), and (3) the program specialization technique, which is used for deriving 
a specialized program from a given program and a given portion of its input 
data (see, for instance, [21] and [24] for a recent survey). 

Although we will not provide in this paper any automatic transformation 
strategy, the reader may realize that in the examples we will present, there 
is a systematic way of performing the program derivations. In particular, we 
perform all derivations according to the repeated application of the following 
sequence of steps: (i) first, we consider some predicate definitions in the initial 
program or we introduce some new predicate definitions, (ii) then we unfold these 
definitions by applying the positive and, possibly, the negative unfolding rules, 
(iii) then we manipulate the derived clauses by applying the rules of replacement, 
constraint addition, and constraint deletion, and (iv) finally, we apply the folding 
rules. The final programs are derived by applying the definition elimination rule, 
and keeping only those clauses that are needed for computing the predicates of 
interest. 



5.1 Determinization: Comparing Even and Odd Occurrences of a 
List 

Let us consider the problem of checking whether or not, for any given list L
of numbers, the following property r(L) holds: every number occurring in L in
an even position is greater than or equal to every number occurring in L in an
odd position. The locally stratified program EvenOdd shown below solves the
given problem by introducing a new predicate p(L) which holds iff there is a
pair (X, Y) of numbers such that X occurs in the list L in an even position, Y
occurs in L in an odd position, and X < Y. Thus, for any list L, the property
r(L) holds iff p(L) does not hold.

EvenOdd:

  1. r(L) ← list(L) ∧ ¬p(L)
  2. p(L) ← I≥1 ∧ J≥1 ∧ X<Y ∧
            occurs(X,I,L) ∧ even(I) ∧ occurs(Y,J,L) ∧ ¬even(J)
  3. even(X) ← X = 0
  4. even(X+1) ← X≥0 ∧ ¬even(X)
  5. occurs(X,I,[H|T]) ← I = 1 ∧ X = H
  6. occurs(X,I+1,[H|T]) ← I≥1 ∧ occurs(X,I,T)
  7. list([]) ←
  8. list([H|T]) ← list(T)

In this program occurs(X,I,L) holds iff X is the I-th element (with I ≥ 1) of
the list L starting from the left. When executed by using SLDNF resolution, this
EvenOdd program may generate, in a nondeterministic way, all possible pairs
(X,Y), occurring in even and odd positions, respectively. This program has an
O(n²) time complexity in the worst case, where n is the length of the input list.
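
As a point of reference for the derivation that follows, the specification
given by clauses 1 and 2 can be transcribed directly into Python. This is only
a sketch: the generate-and-test over all pairs of positions stands in for the
nondeterministic search performed by SLDNF resolution, and the helper name
occurrences is an assumption of the illustration.

```python
from itertools import product


def occurrences(lst):
    """All pairs (i, x) such that x is the i-th element of lst (i >= 1)."""
    return list(enumerate(lst, start=1))


def p(lst):
    """Clause 2: some X in an even position and some Y in an odd position
    satisfy X < Y."""
    return any(
        i % 2 == 0 and j % 2 == 1 and x < y
        for (i, x), (j, y) in product(occurrences(lst), repeat=2)
    )


def r(lst):
    """Clause 1: r(L) holds iff p(L) does not hold."""
    return not p(lst)


print(r([1, 3, 2, 2]))  # even positions hold 3, 2; odd positions hold 1, 2
print(r([2, 1]))        # 1 (even position) is smaller than 2 (odd position)
```

The quadratic number of pairs inspected by p is what the determinization below
removes.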

We want to derive a more efficient definite program that can be executed
in a deterministic way, in the sense that for every constrained goal c ∧ A ∧ G
derived from a given ground query by LD-resolution [3] there exists at most one
clause H ← d ∧ K such that c ∧ A = H ∧ d is satisfiable.

To give a sufficient condition for determinism we need the following notion.
We say that a variable X is a local variable of a clause γ iff
X ∈ FV(bd(γ)) − FV(hd(γ)). The determinism of a program P can be ensured by the
following syntactic conditions: (i) no clause in P has local variables and
(ii) any two clauses H1 ← c1 ∧ G1 and H2 ← c2 ∧ G2 in P are mutually exclusive,
that is, the constraint H1 = H2 ∧ c1 ∧ c2 is unsatisfiable.

Our derivation consists of two transformation sequences. The first sequence
starts from the program made out of clauses 2-8 and derives a deterministic,
definite program Q for predicate p. The second sequence starts from Q ∪ {1} and
derives a deterministic, definite program EvenOdd_det for predicate r.

Let us show the construction of the first transformation sequence. Since 
clause 2 has local variables, we want to transform it into a set of clauses that 
have no local variables and are mutually exclusive, and thus, they will constitute 
a deterministic, definite program. We start off by applying the positive unfolding 
rule to clause 2, followed by applications of the replacement rule based on laws 
of constraints and equality. We derive: 

  9.  p([A|L]) ← J≥1 ∧ Y<A ∧ occurs(Y,J,L) ∧ even(J+1)
  10. p([A|L]) ← I≥1 ∧ J≥1 ∧ X<Y ∧ occurs(X,I,L) ∧
                 even(I+1) ∧ occurs(Y,J,L) ∧ ¬even(J+1)

Now, by applications of the positive unfolding, negative unfolding, and
replacement rules, we derive the following clauses for p:

  11. p([A,B|L]) ← B<A
  12. p([A,B|L]) ← B≥A ∧ I≥1 ∧ X<A ∧ occurs(X,I,L) ∧ even(I)
  13. p([A,B|L]) ← B≥A ∧ I≥1 ∧ B<X ∧ occurs(X,I,L) ∧ ¬even(I)
  14. p([A,B|L]) ← B≥A ∧ I≥1 ∧ J≥1 ∧ X<Y ∧ occurs(X,I,L) ∧ even(I) ∧
                   occurs(Y,J,L) ∧ ¬even(J)

Notice that the three clauses 12, 13, and 14, are not mutually exclusive. In 
order to derive a deterministic program for p, we introduce the following new 
definition: 

  15. new1(A,B,L) ← I≥1 ∧ X<A ∧ occurs(X,I,L) ∧ even(I)
  16. new1(A,B,L) ← I≥1 ∧ B<X ∧ occurs(X,I,L) ∧ ¬even(I)
  17. new1(A,B,L) ← I≥1 ∧ J≥1 ∧ X<Y ∧ occurs(X,I,L) ∧ even(I) ∧
                    occurs(Y,J,L) ∧ ¬even(J)

and we fold clauses 12, 13, and 14 by using the definition of new1, that is,
clauses 15, 16, and 17. We derive:

  18. p([A,B|L]) ← B≥A ∧ new1(A,B,L)

Clauses 11 and 18 have no local variables and are mutually exclusive. We are left
with the problem of deriving a deterministic program for the newly introduced
predicate new1.

By applying the positive unfolding, negative unfolding, and replacement
rules, from clauses 15, 16, and 17, we get:

  19. new1(A,B,[C|L]) ← B<C
  20. new1(A,B,[C|L]) ← I≥1 ∧ B<X ∧ occurs(X,I,L) ∧ even(I)
  21. new1(A,B,[C|L]) ← I≥1 ∧ X<A ∧ occurs(X,I,L) ∧ ¬even(I)
  22. new1(A,B,[C|L]) ← I≥1 ∧ X<C ∧ occurs(X,I,L) ∧ ¬even(I)
  23. new1(A,B,[C|L]) ← I≥1 ∧ J≥1 ∧ X<Y ∧ occurs(X,I,L) ∧
                        ¬even(I) ∧ occurs(Y,J,L) ∧ even(J)

In order to derive mutually exclusive clauses without local variables we first
apply the replacement rule and derive sets of clauses corresponding to mutually
exclusive cases, and then we fold each of these sets of clauses. We use the
replacement rule based on law (5) and law (9), which is justified by the
equivalence ∀X ∀Y (true ↔ X≥Y ∨ X<Y). We get:

  24. new1(A,B,[C|L]) ← B<C
  25. new1(A,B,[C|L]) ← B≥C ∧ A≥C ∧ I≥1 ∧ B<X ∧
                        occurs(X,I,L) ∧ even(I)
  26. new1(A,B,[C|L]) ← B≥C ∧ A≥C ∧ I≥1 ∧ X<A ∧
                        occurs(X,I,L) ∧ ¬even(I)
  27. new1(A,B,[C|L]) ← B≥C ∧ A≥C ∧ I≥1 ∧ J≥1 ∧ X<Y ∧
                        occurs(X,I,L) ∧ ¬even(I) ∧
                        occurs(Y,J,L) ∧ even(J)
  28. new1(A,B,[C|L]) ← B≥C ∧ A<C ∧ I≥1 ∧ B<X ∧
                        occurs(X,I,L) ∧ even(I)
  29. new1(A,B,[C|L]) ← B≥C ∧ A<C ∧ I≥1 ∧ X<C ∧
                        occurs(X,I,L) ∧ ¬even(I)
  30. new1(A,B,[C|L]) ← B≥C ∧ A<C ∧ I≥1 ∧ J≥1 ∧ X<Y ∧
                        occurs(X,I,L) ∧ ¬even(I) ∧
                        occurs(Y,J,L) ∧ even(J)

The three sets of clauses {24}, {25, 26, 27}, and {28, 29, 30} correspond to
the mutually exclusive cases (B<C), (B≥C ∧ A≥C), and (B≥C ∧ A<C),
respectively. Now, in order to fold each of the sets {25, 26, 27} and
{28, 29, 30} and derive mutually exclusive clauses without local variables, we
introduce the following new definition:

  31. new2(A,B,L) ← I≥1 ∧ B<X ∧ occurs(X,I,L) ∧ even(I)
  32. new2(A,B,L) ← I≥1 ∧ X<A ∧ occurs(X,I,L) ∧ ¬even(I)
  33. new2(A,B,L) ← I≥1 ∧ J≥1 ∧ X<Y ∧ occurs(X,I,L) ∧ ¬even(I) ∧
                    occurs(Y,J,L) ∧ even(J)

By folding clauses 25, 26, 27 and 28, 29, 30 using clauses 31, 32, and 33, for
predicate new1 we get the following mutually exclusive clauses without local
variables:

  34. new1(A,B,[C|L]) ← B<C
  35. new1(A,B,[C|L]) ← B≥C ∧ A≥C ∧ new2(A,B,L)
  36. new1(A,B,[C|L]) ← B≥C ∧ A<C ∧ new2(C,B,L)

Unfortunately, the clauses for the new predicate new2 have local variables and
are not mutually exclusive. Thus, we continue our derivation and, by applying
the positive unfolding, negative unfolding, replacement, and folding rules, from
clauses 31, 32, and 33 we derive the following clauses (this derivation is similar
to the derivation that led from {15, 16, 17} to {34, 35, 36} and we omit it):

  37. new2(A,B,[C|L]) ← C<A
  38. new2(A,B,[C|L]) ← C≥A ∧ B≥C ∧ new1(A,C,L)
  39. new2(A,B,[C|L]) ← C≥A ∧ B<C ∧ new1(A,B,L)

The set of clauses derived so far starting from the initial clause 2, that is,
{11, 18, 34, 35, 36, 37, 38, 39}, constitutes a deterministic program for p,
call it Q.
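
To see how the mutually exclusive guards of Q drive a single left-to-right
traversal, the clauses 11, 18, and 34-39 can be transcribed into Python as
follows. This is a sketch under the assumption that lists are Python lists of
numbers; each branch is annotated with the clause it mirrors, and the clause
guards that are implied by the failure of an earlier test are left implicit.

```python
def p(lst):
    """Program Q: p(L) holds iff some number in an even position of L is
    smaller than some number in an odd position of L."""
    if len(lst) < 2:
        return False                 # Q has no clause for p([]) or p([A])
    a, b, rest = lst[0], lst[1], lst[2:]
    if b < a:
        return True                  # clause 11
    return new1(a, b, rest)          # clause 18 (here B >= A)


def new1(a, b, lst):
    """The head of lst occupies an odd position of the original list."""
    if not lst:
        return False                 # no clause for new1(A, B, [])
    c, rest = lst[0], lst[1:]
    if b < c:
        return True                  # clause 34
    if a >= c:
        return new2(a, b, rest)      # clause 35 (B >= C and A >= C)
    return new2(c, b, rest)          # clause 36 (B >= C and A < C)


def new2(a, b, lst):
    """The head of lst occupies an even position of the original list."""
    if not lst:
        return False                 # no clause for new2(A, B, [])
    c, rest = lst[0], lst[1:]
    if c < a:
        return True                  # clause 37
    if b >= c:
        return new1(a, c, rest)      # clause 38 (C >= A and B >= C)
    return new1(a, b, rest)          # clause 39 (C >= A and B < C)
```

At every call exactly one branch applies, which is the operational reading of
the two syntactic conditions for determinism stated earlier in this section.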

Now we construct the second transformation sequence starting from Q ∪ {1}
for deriving a deterministic, definite program for r. We start off by considering
clause 1 which defines r and, by positive unfolding, negative unfolding, and
replacement we derive:

  40. r([]) ←
  41. r([A]) ←
  42. r([A,B|L]) ← list(L) ∧ B≥A ∧ ¬new1(A,B,L)

By introducing the following definition:

  43. new3(A,B,L) ← list(L) ∧ B≥A ∧ ¬new1(A,B,L)

and then folding clause 42 using clause 43, we derive the following definite
clauses:

  44. r([]) ←
  45. r([A]) ←
  46. r([A,B|L]) ← B≥A ∧ new3(A,B,L)

Now, we want to transform clause 43 into a set of definite clauses. By positive 
unfolding, negative unfolding, and replacement, from clause 43 we derive: 

  47. new3(A,B,[]) ←
  48. new3(A,B,[C|L]) ← B≥C ∧ A<C ∧ list(L) ∧ B≥C ∧ ¬new2(C,B,L)
  49. new3(A,B,[C|L]) ← B≥C ∧ A≥C ∧ list(L) ∧ B≥A ∧ ¬new2(A,B,L)

In order to transform clauses 48 and 49 into definite clauses, we introduce the
following definition:

  50. new4(A,B,L) ← list(L) ∧ B≥A ∧ ¬new2(A,B,L)

and we fold clauses 48 and 49 using clause 50. We get:

  51. new3(A,B,[]) ← B≥A
  52. new3(A,B,[C|L]) ← B≥C ∧ A<C ∧ new4(C,B,L)
  53. new3(A,B,[C|L]) ← B≥C ∧ A≥C ∧ new4(A,B,L)

Now we are left with the task of transforming clause 50 into a set of definite
clauses. By applying the positive unfolding, negative unfolding, replacement,
and folding rules, we derive:

  54. new4(A,B,[]) ← B≥A
  55. new4(A,B,[C|L]) ← B<C ∧ C≥A ∧ new3(A,B,L)
  56. new4(A,B,[C|L]) ← B≥C ∧ C≥A ∧ new3(A,C,L)

Finally, by eliminating the definitions of the predicates on which r does not 
depend, we get, as desired, the following final program which is a deterministic, 
definite program. 

EvenOdd_det:

  44. r([]) ←
  45. r([A]) ←
  46. r([A,B|L]) ← B≥A ∧ new3(A,B,L)
  51. new3(A,B,[]) ← B≥A
  52. new3(A,B,[C|L]) ← B≥C ∧ A<C ∧ new4(C,B,L)
  53. new3(A,B,[C|L]) ← B≥C ∧ A≥C ∧ new4(A,B,L)
  54. new4(A,B,[]) ← B≥A
  55. new4(A,B,[C|L]) ← B<C ∧ C≥A ∧ new3(A,B,L)
  56. new4(A,B,[C|L]) ← B≥C ∧ C≥A ∧ new3(A,C,L)

Given a list of numbers L of length n, the EvenOdd_det program checks that r(L)
holds by performing at most 2n comparisons between numbers occurring in L.
Program EvenOdd_det works by traversing the input list L only once (without
backtracking) and storing, for every initial portion L1 of the input list L, the
maximum number A occurring in an odd position of L1 and the minimum number B
occurring in an even position of L1 (see the first two arguments of the
predicates new3 and new4). When looking at the first element C of the portion
of the input list still to be visited (i.e., the third argument of new3 or new4),
the following two cases are possible: either (Case 1) the element C occurs in an
odd position of the input list L, i.e., a call of the form new3(A,B,[C|L2]) is
executed, or (Case 2) the element C occurs in an even position of the input list
L, i.e., a call of the form new4(A,B,[C|L2]) is executed. In Case (1) program
EvenOdd_det checks that B≥C holds and then updates the value of the maximum
number occurring in an odd position with the maximum between A and C.
In Case (2) program EvenOdd_det checks that C≥A holds and then updates the
value of the minimum number occurring in an even position with the minimum
between B and C.
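
The behaviour just described can be sketched as a single loop in Python. The
loop is an illustrative rendering of clauses 44-46 and 51-56, not part of the
derivation: the Boolean flag odd_turn (an assumed name) plays the role of the
alternation between new3 and new4, while a and b hold the running maximum of
the odd positions and the running minimum of the even positions.

```python
def r(lst):
    """EvenOdd_det: every number in an even position of lst is greater than
    or equal to every number in an odd position of lst."""
    if len(lst) < 2:
        return True                  # clauses 44 and 45
    a, b = lst[0], lst[1]            # a: max of odd positions, b: min of even
    if not b >= a:
        return False                 # guard of clause 46
    odd_turn = True                  # the next element is in an odd position
    for c in lst[2:]:
        if odd_turn:                 # new3, clauses 52 and 53
            if not b >= c:
                return False
            a = max(a, c)
        else:                        # new4, clauses 55 and 56
            if not c >= a:
                return False
            b = min(b, c)
        odd_turn = not odd_turn
    return b >= a                    # clauses 51 and 54


print(r([1, 3, 2, 2]))
print(r([1, 3, 2, 1]))
```

Each element after the second one costs one guard test plus one comparison
inside max or min, which is consistent with the bound of 2n comparisons stated
above.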



5.2 Program Synthesis: The N-queens Problem 

The N-queens problem has often been considered in the literature for presenting
various programming techniques, such as recursion and backtracking. We
consider it here as an example of the program synthesis technique, as was
done in [41]. Our derivation is different from the one presented in [41],
because the derivation in [41] makes use of the unfold/fold transformation rules
for definite programs together with an ad hoc transformation rule (called negation
technique) for transforming general programs (with negation) into definite
programs. In contrast, we use unfold/fold transformation rules for general
programs, and in particular, our negative unfolding rule of Section 3.

The N-queens problem can be informally specified as follows. We are required
to place N (≥ 0) queens on an N × N chess board, so that no two queens attack
each other, that is, they do not lie on the same row, column, or diagonal. A
board configuration with this property is said to be safe. By using the fact that
no two queens should lie on the same row, we represent an N × N chess board
as a list L of N positive integers: the k-th element of L represents the column
of the queen on row k.

In order to give a formal specification of the N-queens problem we follow the
approach presented in [32], which is based on first order logic. We introduce the
following constraint logic program:

  P: nat(0) ←
     nat(N) ← N = M+1 ∧ M≥0 ∧ nat(M)
     nat-list([]) ←
     nat-list([H|T]) ← nat(H) ∧ nat-list(T)
     length([],0) ←
     length([H|T],N) ← N = M+1 ∧ M≥0 ∧ length(T,M)
     member(X,[H|T]) ← X = H
     member(X,[H|T]) ← member(X,T)
     in-range(X,M,N) ← X = N ∧ M≤N
     in-range(X,M,N) ← N = K+1 ∧ M≤K ∧ in-range(X,M,K)
     occurs(X,I,[H|T]) ← I = 1 ∧ X = H
     occurs(X,I+1,[H|T]) ← I≥1 ∧ occurs(X,I,T)

and the following first order formula:

  φ(N,L): nat(N) ∧ nat-list(L) ∧                                      (1)
          length(L,N) ∧ ∀X (member(X,L) → in-range(X,1,N)) ∧          (2)
          ∀A,B,M,N ((1≤M ∧ M<N ∧ occurs(A,M,L) ∧ occurs(B,N,L))       (3)
                    → (A≠B ∧ A−B≠N−M ∧ B−A≠N−M))                      (4)

In the above program and formula in-range(X,M,N) holds iff X ∈ {M, M+1,
..., N} and N≥0. The other predicates have been defined in previous programs
or do not require explanation. Now we define the relation queens(N,L), where
N is a nonnegative integer and L is a list of positive integers, as follows:

  queens(N,L) iff M(P) ⊨ φ(N,L)

Line (2) of the formula φ(N,L) above specifies a chess board as a list of N
integers each of which is in the range [1,...,N]. If N = 0 the list is empty.
Lines (3) and (4) of φ(N,L) specify the safety property of board configurations.
Now, we would like to derive a constraint logic program R which computes the
relation queens(N,L), that is, R should define a predicate queens(N,L) such
that:

  (π) M(R) ⊨ queens(N,L) iff M(P) ⊨ φ(N,L)

Following the approach presented in [32], we start from the formula (called a
statement) queens(N,L) ← φ(N,L) and, by applying a variant of the Lloyd-Topor
transformation [26], we derive the following stratified logic program:

  F: 1. queens(N,L) ← nat(N) ∧ nat-list(L) ∧ length(L,N) ∧
                      ¬aux1(L,N) ∧ ¬aux2(L)
     2. aux1(L,N) ← member(X,L) ∧ ¬in-range(X,1,N)
     3. aux2(L) ← 1≤K ∧ K<M ∧
                  ¬(A≠B ∧ A−B≠M−K ∧ B−A≠M−K) ∧
                  occurs(A,K,L) ∧ occurs(B,M,L)

This variant of the Lloyd-Topor transformation is a fully automatic
transformation, but it cannot be performed by using our transformation rules,
because it operates on first order formulas. It can be shown that this variant
of the Lloyd-Topor transformation preserves the perfect model semantics and,
thus, we have that: M(P ∪ F) ⊨ queens(N,L) iff M(P) ⊨ φ(N,L).
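
For a fixed N and a fixed list L, the three clauses of F can be evaluated
directly. The following Python sketch does so; it is a checker for given N and
L, not a generator of solutions. Representing L as a Python list of integers,
and reading in-range(X,M,N) as M ≤ X ≤ N over the integers, are assumptions of
the illustration.

```python
def in_range(x, m, n):
    """in-range(X, M, N) over the integers: X is one of M, M+1, ..., N."""
    return m <= x <= n


def aux1(lst, n):
    """Clause 2: some member of the list is not in the range 1..n."""
    return any(not in_range(x, 1, n) for x in lst)


def aux2(lst):
    """Clause 3: the queens on two rows k < m share a column or a diagonal."""
    return any(
        not (a != b and a - b != m - k and b - a != m - k)
        for k, a in enumerate(lst, start=1)
        for m, b in enumerate(lst, start=1)
        if k < m
    )


def queens(n, lst):
    """Clause 1: n and all elements are naturals, lst has length n, and
    neither aux1 nor aux2 holds."""
    naturals = n >= 0 and all(x >= 0 for x in lst)
    return naturals and len(lst) == n and not aux1(lst, n) and not aux2(lst)


print(queens(4, [2, 4, 1, 3]))
print(queens(4, [1, 2, 3, 4]))
```

Used as a generator, this specification would have to enumerate candidate lists
blindly, which mirrors the computational weakness of P ∪ F discussed next.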

The derived program P ∪ F is not very satisfactory from a computational
point of view because, when using SLDNF resolution with the left-to-right
selection rule, it may not terminate for calls of the form queens(n,L) where n
is a nonnegative integer and L is a variable. Thus, the process of program
synthesis proceeds by applying the transformation rules listed in Section 3,
thereby transforming program P ∪ F into a program R such that: (i) Property (π)
holds, (ii) R is a definite program, and (iii) R terminates for all calls of
the form queens(n,L), where n is any nonnegative integer and L is a variable.
Actually, the derivation of the final program R is performed by constructing
two transformation sequences: (i) a first one, which starts from the initial
program P, introduces clauses 2 and 3 by definition introduction, and ends with
a program Q, and (ii) a second one, which starts from program Q, introduces
clause 1 by definition introduction, and ends with program R.

We will illustrate the application of the transformation rules for deriving
program R without discussing in detail how this derivation can be performed in
an automatic way using a particular strategy. As already mentioned, the design
of suitable transformation strategies for the automation of program derivations
for constraint logic programs is beyond the scope of the present paper.

The program transformation process starts off from program P ∪ {2, 3} by
transforming clauses 2 and 3 into a set of clauses without local variables, so that
they can be subsequently used for unfolding clause 1 w.r.t. ¬aux1(L,N) and
¬aux2(L) (see the negative unfolding rule R4).

By positive unfolding, replacement, and positive folding, from clause 2 we 
derive: 

  4. aux1([H|T],N) ← ¬in-range(H,1,N)
  5. aux1([H|T],N) ← aux1(T,N)

Similarly, by positive unfolding, replacement, and positive folding, from clause
3 we derive:

  6. aux2([A|T]) ← M≥1 ∧ ¬(A≠B ∧ A−B≠M ∧ B−A≠M) ∧ occurs(B,M,T)
  7. aux2([A|T]) ← aux2(T)

In order to eliminate the local variables B and M occurring in clause 6, by the
definition introduction rule we introduce the following new clause, whose body
is a generalization of the body of clause 6:

  8. new1(A,T,K) ← M≥1 ∧ ¬(A≠B ∧ A−B≠M+K ∧ B−A≠M+K) ∧ occurs(B,M,T)

By replacement and positive folding, from clause 6 we derive:

  6f. aux2([A|T]) ← new1(A,T,0)

Now, by positive unfolding, replacement, and positive folding, from clause 8 we
derive:

  9.  new1(A,[B|T],K) ← ¬(A≠B ∧ A−B≠K+1 ∧ B−A≠K+1)
  10. new1(A,[B|T],K) ← new1(A,T,K+1)

The program, call it Q, derived so far is P ∪ {4, 5, 6f, 7, 9, 10}, and clauses 4, 5,
6f, 7, 9, and 10 have no local variables.

Now we construct a new transformation sequence which takes Q as initial
program. We start off by applying the definition introduction rule and adding
clause 1 to program Q. Our objective is to transform clause 1 into a set of definite
clauses. We first apply the definition introduction rule and we introduce the
following clause, whose body is a generalization of the body of clause 1:

  11. new2(N,L,K) ← nat(M) ∧ nat-list(L) ∧ length(L,M) ∧
                    ¬aux1(L,N) ∧ ¬aux2(L) ∧ N = M+K

By replacement and positive folding, from clause 11 we derive:

  1f. queens(N,L) ← new2(N,L,0)

By positive and negative unfolding, replacement, constraint addition, and
positive folding, from clause 11 we derive:

  12. new2(N,[],K) ← N = K
  13. new2(N,[H|T],K) ← N≥K+1 ∧ new2(N,T,K+1) ∧
                        nat(H) ∧ nat-list(T) ∧ in-range(H,1,N) ∧
                        ¬new1(H,T,0)

In order to derive a definite program we introduce a new predicate new3 defined
by the following clause:

  14. new3(A,T,N,M) ← nat(A) ∧ nat-list(T) ∧ in-range(A,1,N) ∧
                      ¬new1(A,T,M)

We fold clause 13 using clause 14 and we derive the following definite clause:

  13f. new2(N,[H|T],K) ← N≥K+1 ∧ new2(N,T,K+1) ∧ new3(H,T,N,0)

By positive and negative unfolding, replacement, and positive folding, from
clause 14 we derive the following definite clauses:

  15. new3(A,[],N,M) ← in-range(A,1,N) ∧ nat(A)
  16. new3(A,[B|T],N,M) ← A≠B ∧ A−B≠M+1 ∧ B−A≠M+1 ∧
                          nat(B) ∧ new3(A,T,N,M+1)

Finally, by assuming that the set of predicates of interest is the singleton
{queens}, by definition elimination we derive the following program:

  R: 1f.  queens(N,L) ← new2(N,L,0)
     12.  new2(N,[],K) ← N = K
     13f. new2(N,[H|T],K) ← N≥K+1 ∧ new2(N,T,K+1) ∧ new3(H,T,N,0)
     15.  new3(A,[],N,M) ← in-range(A,1,N) ∧ nat(A)
     16.  new3(A,[B|T],N,M) ← A≠B ∧ A−B≠M+1 ∧ B−A≠M+1 ∧
                              nat(B) ∧ new3(A,T,N,M+1)

together with the clauses for the predicates in-range and nat.

Program R is a definite program and, by Theorem 3, we have that M(R) ⊨
queens(N,L) iff M(P ∪ F ∪ Defs) ⊨ queens(N,L), where F ∪ Defs is the set of all
clauses introduced by the definition introduction rule during the transformation
sequences from P to R. Since queens does not depend on Defs in P ∪ F ∪ Defs,
we have that M(R) ⊨ queens(N,L) iff M(P ∪ F) ⊨ queens(N,L) and, thus,
Property (π) holds. Moreover, it can be shown that R terminates for all calls of
the form queens(n,L), where n is any nonnegative integer and L is a variable.

Notice that program R computes a solution of the N-queens problem in a
clever way: each time a queen is placed on the board, program R checks that it
does not attack any other queen already placed on the board.
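
The operational reading of R for a call queens(n,L) with L unbound can be
sketched in Python with a generator. Two points are assumptions of this
illustration rather than part of the paper: the candidate values for H in
clause 13f are enumerated explicitly from 1 to n (in R they are produced by
in-range in clause 15), and the nat tests are omitted because every candidate
is already a positive integer.

```python
def queens(n):
    """All lists L such that queens(n, L) holds (clause 1f)."""
    return list(new2(n, 0))


def new2(n, k):
    """Yield every list T such that new2(n, T, k) holds."""
    if n == k:
        yield []                                   # clause 12
    if n >= k + 1:                                 # clause 13f
        for tail in new2(n, k + 1):
            for h in range(1, n + 1):              # candidate columns for H
                if new3(h, tail, n, 0):
                    yield [h] + tail


def new3(a, lst, n, m):
    """The queen in column a does not attack the queens in lst, where the
    first element of lst is m + 1 rows away from it."""
    if not lst:
        return 1 <= a <= n                         # clause 15
    b, rest = lst[0], lst[1:]
    return (a != b and a - b != m + 1 and b - a != m + 1
            and new3(a, rest, n, m + 1))           # clause 16


print(sorted(queens(4)))
```

Every new queen is tested only against the queens already placed, so unsafe
partial boards are discarded as soon as they arise.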

5.3 Program Specialization: Derivation of Counter Machines from 
Constrained Regular Expressions 

Given a set 𝒩 of variables ranging over natural numbers, a set C of constraints
over natural numbers, and a set K of identifiers, we define a constrained regular
expression e over the alphabet {a, b} as follows:

  e ::= a | b | e1 · e2 | e1 + e2 | e^N | not(e) | k

where N ∈ 𝒩 and k ∈ K. An identifier k ∈ K is defined by a definition of
the form k = (c : e), where c ∈ C and e is a constrained regular expression.
For instance, the set {a^m b^n | m = n ≥ 0} of strings in {a, b}* is denoted by
the identifier k which is defined by the following definition:

  k = (M = N : (a^M · b^N)).

Obviously, constrained regular expressions may denote languages which are not
regular.

Given a string S and a constrained regular expression e, the following locally
stratified program P checks whether or not S belongs to the language denoted
by e. We assume that constraints are definable as conjunctions of equalities and
disequalities over natural numbers.

  P: string([]) ←
     string([a|S]) ← string(S)
     string([b|S]) ← string(S)
     symbol(a) ←
     symbol(b) ←
     app([],L,L) ←
     app([A|X],Y,[A|Z]) ← app(X,Y,Z)
     in-language([A],A) ← symbol(A)
     in-language(S,(E1·E2)) ← app(S1,S2,S) ∧
                              in-language(S1,E1) ∧ in-language(S2,E2)
     in-language(S,E1+E2) ← in-language(S,E1)
     in-language(S,E1+E2) ← in-language(S,E2)
     in-language(S,not(E)) ← ¬in-language(S,E)
     in-language([],E^I) ← I = 0
     in-language(S,E^I) ← I = J+1 ∧ J≥0 ∧ app(S1,S2,S) ∧
                          in-language(S1,E) ∧ in-language(S2,E^J)
     in-language(S,K) ← (K = (C:E)) ∧ solve(C) ∧ in-language(S,E)
     solve(X = Y) ← X = Y
     solve(X≥Y) ← X≥Y
     solve(C1 ∧ C2) ← solve(C1) ∧ solve(C2)

For example, in order to check whether a string S does not belong to the language
denoted by k, where k is defined by the following definition:
k = (M = N : (a^M · b^N)), we add to program P the clause:

  (k = (M = N : (a^M · b^N))) ←

and we evaluate a query of the form:

  string(S) ∧ in-language(S, not(k))

Now, if we want to specialize program P w.r.t. this query, we introduce the new 
definition: 

  1. new1(S) ← string(S) ∧ in-language(S, not(k))

By unfolding clause 1 we get:

  2. new1(S) ← string(S) ∧ ¬in-language(S,k)

We cannot perform the negative unfolding of clause 2 w.r.t. ¬in-language(S,k)
because of the local variables in the clauses for in-language(S,k). In order to
derive a predicate which is equivalent to in-language(S,k) and is defined by
clauses without local variables, we introduce the following clause:

  3. new2(S) ← in-language(S,k)

By unfolding clause 3 we get:

  4. new2(S) ← M = N ∧ app(S1,S2,S) ∧
               in-language(S1,a^M) ∧ in-language(S2,b^N)

We generalize clause 4 and we introduce the following clause 5:

  5. new3(S,I) ← M = N+I ∧ app(S1,S2,S) ∧
                 in-language(S1,a^M) ∧ in-language(S2,b^N)

By unfolding clause 5, performing replacements based on laws of constraints, 
and folding, we get: 

  6. new3(S,N) ← in-language(S,b^N)
  7. new3([a|S],N) ← new3(S,N+1)

In order to fold clause 6 we introduce the following definition:

  8. new4(S,N) ← in-language(S,b^N)

By unfolding clause 8, performing some replacements based on laws of
constraints, and folding, we get:

  9.  new4([],0) ←
  10. new4([b|S],N) ← N≥1 ∧ new4(S,N−1)

By negative folding of clause 2 and positive folding of clauses 4 and 6 we get the
following program:

  2f. new1(S) ← string(S) ∧ ¬new2(S)
  4f. new2(S) ← new3(S,0)
  6f. new3(S,N) ← new4(S,N)
  7.  new3([a|S],N) ← new3(S,N+1)
  9.  new4([],0) ←
  10. new4([b|S],N) ← N≥1 ∧ new4(S,N−1)

Now from clause 2f, by positive and negative unfoldings, replacements based on 
laws of constraints, and folding, we get: 

  11. new1([a|S]) ← string(S) ∧ ¬new3(S,1)
  12. new1([b|S]) ← string(S)

In order to fold clause 11 we introduce the following definition:

  13. new5(S,N) ← string(S) ∧ ¬new3(S,N)

By positive and negative unfolding and folding we get:

  14. new5([],N) ←
  15. new5([a|S],N) ← new5(S,N+1)
  16. new5([b|S],N) ← string(S) ∧ ¬N≥1
  17. new5([b|S],N) ← string(S) ∧ ¬new4(S,N−1)

In order to fold clause 17 we introduce the following definition:

  18. new6(S,N) ← string(S) ∧ ¬new4(S,N)

Now, starting from clause 18, by positive and negative unfolding, replacements 
based on laws of constraints, folding, and elimination of the predicates on which 
newl does not depend, we get the following final, specialized program: 

  P_spec: 11f. new1([a|S]) ← new5(S,1)
          12.  new1([b|S]) ← string(S)
          14.  new5([],N) ←
          15.  new5([a|S],N) ← new5(S,N+1)
          16.  new5([b|S],0) ← string(S)
          17f. new5([b|S],N) ← new6(S,N−1)
          19.  new6([],N) ← N≠0
          20.  new6([a|S],N) ← string(S)
          21.  new6([b|S],0) ← string(S)
          22.  new6([b|S],N) ← new6(S,N−1)
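
The clauses of P_spec can be read as the transition table of a machine with two
control states and one counter. The following Python sketch makes this reading
explicit. It assumes that strings are Python strings over the characters a and
b, that the predicate string(S) is checked once up front, and that clauses 17f
and 22 apply only when N ≥ 1 (the constraints range over natural numbers, so
N−1 is not defined for N = 0); the function name not_in_language is likewise an
assumption.

```python
def not_in_language(s):
    """new1(S): s is a string over {a, b} that is not of the form a^m b^m."""
    if any(ch not in "ab" for ch in s):
        return False                      # string(S) fails
    if s == "":
        return False                      # P_spec has no clause for new1([])
    if s[0] == "b":
        return True                       # clause 12
    state, n = "new5", 1                  # clause 11f
    for ch in s[1:]:
        if state == "new5":
            if ch == "a":
                n += 1                    # clause 15
            elif n == 0:
                return True               # clause 16
            else:
                state, n = "new6", n - 1  # clause 17f
        else:
            if ch == "a" or n == 0:
                return True               # clauses 20 and 21
            n -= 1                        # clause 22
    if state == "new5":
        return True                       # clause 14
    return n != 0                         # clause 19


print(not_in_language("aabb"))
print(not_in_language("aab"))
```

The counter n is incremented while reading the leading a's and decremented
while reading b's, so the whole test is a single pass over the input.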

This specialized program corresponds to a one-counter machine (that is, a
pushdown automaton where the stack alphabet contains one letter only [5]) and it
takes O(n) time to test that a string of length n does not belong to the language
{a^m b^n | m = n ≥ 0}.

6 Related Work and Conclusions

During the last two decades various sets of unfold/fold transformation rules 
have been proposed for different classes of logic programs. The authors who first 
introduced the unfold/fold rules for logic programs were Tamaki and Sato in 
their seminal paper [44]. That paper presents a set of rules for transforming 
definite logic programs and it also presents the proof that those rules are correct 
w.r.t. the least Herbrand model semantics. Most of the subsequent papers in the 
field have followed Tamaki and Sato’s approach in that: (i) the various sets of 
rules which have been published can be seen as extensions or variants of Tamaki 
and Sato’s rules, and (ii) the techniques used for proving the correctness of the 
rules are similar to those used by Tamaki and Sato (the reader may look at 
the references given later in this section, and also at [29] for a survey). In the 
present paper we ourselves have followed Tamaki and Sato’s approach, but we 
have considered the more complex framework of locally stratified constraint logic 
programs with the perfect model semantics. 

Among the rules we have presented, the following ones were initially
introduced in [44] (in the case of definite logic programs): (R1) definition
introduction, restricted to one clause only (that is, with m=1), (R3) positive
unfolding, (R5) positive folding, restricted to one clause only (that is, with
m=1). Our rules of replacement, deletion of useless predicates, constraint
addition, and constraint deletion (that is, rules R7, R8, R9, and R10,
respectively) are extensions to the case of constraint logic programs with
negation of the goal replacement and clause addition/deletion rules presented
in [44]. In comparing the rules in [44]
and the corresponding rules we have proposed, let us highlight also the following 
important difference. The goal replacement and clause addition/deletion of [44] 
are very general, but their applicability conditions are based on properties of 
the least Herbrand model and properties of the proof trees (such as goal equiv- 
alence or clause implication) which, in general, are very difficult to prove. On 
the contrary, (i) the applicability conditions of our replacement rule require the 
verification of (usually decidable) properties of the constraints, (ii) the property 
of being a useless predicate is decidable, because it refers to predicate symbols 
only (and not to the value of their arguments), and (iii) the applicability con- 
ditions for constraint addition and constraint deletion can be verified in most 
cases by program analysis techniques based on abstract interpretation [10]. 

For the correctness theorem (see Theorem 3) relative to admissible transfor- 
mation sequences we have followed Tamaki and Sato’s approach, and as in [44], 
the correctness is ensured by assuming the validity of some suitable conditions 
on the construction of the transformation sequences. 

Let us now relate our work here to that of other authors who have extended 
in several ways the work by Tamaki and Sato and, in particular, those who have 
extended it to the cases of: (i) general logic programs, and (ii) constraint logic 
programs. 

Tamaki and Sato’s unfolding and folding rules have been extended to general
logic programs (without constraints) by Seki. He proved his extended rules
correct w.r.t. various semantics, including the perfect model semantics [42,43].
Building upon previous work for definite logic programs reported in [17,22,38],
paper [37] extended Seki’s folding rule by allowing: (i) multiple folding, that is,
one can fold m (≥ 1) clauses at a time using a definition consisting of m clauses,
and (ii) recursive folding, that is, the definition used for folding can contain
recursive clauses.

Multiple folding can be performed by applying our rule R5, but recursive
folding cannot. Indeed, by rule R5 we can fold using a definition introduced by
rule R1, and this rule does not allow the introduction of recursive clauses. Thus,
in this respect the folding rule presented in this paper is less powerful than the 
folding rule considered in [37]. On the other hand, the set of rules presented 
here is more powerful than the one in [37] because it includes negative unfolding 
(R4) and negative folding (R6). These two rules are very useful in practice, and 
both are needed for the program derivation examples we have given in Section 5. 
They are also needed in the many examples of program verification presented 
in [13]. For reasons of simplicity, we have presented our non-recursive version of 
the positive folding rule because it has much simpler applicability conditions. In 
particular, the notion of admissible transformation sequence is much simpler for 
non-recursive folding. We leave for future research the problem of studying the 
correctness of a set of transformation rules which includes positive and negative 
unfolding, as well as recursive positive folding and recursive negative folding. 

Negative unfolding and negative folding were also considered in our previous 
work [32]. The present paper extends the transformation rules presented in [32] 
by adapting them to a logic language with constraints. Moreover, in [32] we did 
not present the proof of correctness of the transformation rules and we only 
showed some applications of our transformation rules to theorem proving and 
program synthesis. 

In [40] Sato proposed a set of transformation rules for first order programs, 
that is, for a logic language that extends general logic programs by allowing 
arbitrary first order formulas in the bodies of the clauses. However, the semantics 
considered in [40] is based on a three valued logic with the three truth values 
true, false, and undefined (corresponding to non terminating computations). 
Thus, the results presented in [40] cannot be directly compared with ours. In 
particular, for instance, the rule for eliminating useless predicates (R8) does 
not preserve the three valued semantics proposed in [40], because this rule may 
transform a program that does not terminate for a given query, into a program 
that terminates for that query. Moreover, the conditions for the applicability 
of the folding rule given in [40] are based on the chosen three valued logic and 
cannot be compared with those presented in this paper. 

Various other sets of transformation rules for general logic programs
(including several variants of the goal replacement rule) have been proved correct
w.r.t. other semantics, such as the operational semantics based on SLDNF
resolution [16,42], Clark’s completion [16], and Kunen’s and Fitting’s three valued
extensions of Clark’s completion [8]. We will not enter into a detailed
comparison with these works here. It will suffice to say that these works are not directly
comparable with ours because of the different set of rules (in particular, none of
these works considers the negative unfolding rule) and the different semantics
considered.

The unfold/fold transformation rules have also been extended to constraint 
logic programs in [7,11,12,27]. Papers [7,11] deal with definite programs, while 
[27] considers locally stratified programs and proves that, with suitable restric- 
tions, the unfolding and folding rules preserve the perfect model semantics. Our 
correctness result presented here extends that in [27] because: (i) the rules of [27] 
include neither negative unfolding nor negative folding, and (ii) the folding rule 
of [27] is reversible, that is, it can only be applied for folding a set of clauses in 
a program P by using a set of clauses that occur in P. As already mentioned 
in Section 3, our folding rule is not reversible, because we may fold clauses 
in program $P_k$ of a transformation sequence by using definitions occurring in
$Defs_k$, but possibly not in $P_k$. Reversibility is a very strong limitation, because
it does not allow the derivation of recursive clauses from non-recursive clauses. 
In particular, the derivations presented in our examples of Section 5 could not 
be performed by using the reversible folding rule of [27]. 

Finally, [12] proposes a set of transformation rules for locally stratified con- 
straint logic programs tailored to a specific task, namely, program specialization 
and its application to the verification of infinite state reactive systems. Due to 
their specific application, the transformation rules of [12] are much more re- 
stricted than the ones presented here. In particular, by using the rules of [12]: 
(i) we can only introduce constrained atomic definitions, that is, definitions that 
consist of single clauses whose body is a constrained atom, (ii) we can unfold 
clauses w.r.t. a negated atom only if that atom succeeds or fails in one step, and 
(iii) we can apply the positive and negative folding rules by using constrained 
atomic definitions only. 

We envisage several lines for further development of the work presented in 
this paper. As a first step forward, one could design strategies for automating 
the application of the transformation rules proposed here. In our examples of 
Section 5 we have demonstrated that some strategies already considered in the 
literature for the case of definite programs, can be extended to general constraint 
logic programs. This extension can be done, in particular, for the following strate- 
gies: (i) the elimination of local variables [34], (ii) the derivation of deterministic 
programs [33], and (iii) the rule-based program specialization [24]. 

It has been pointed out by recent studies that there is a strict relationship be- 
tween program transformation and various other methodologies for program de- 
velopment and software verification (see, for instance, [13,15,25,30,31,36]). Thus, 
strategies for the automatic application of transformation rules can be exploited 
in the design of automatic techniques in these related fields and, in particu- 
lar, in program synthesis and theorem proving. We believe that transformation 
methodologies for logic and constraint languages can form the basis of a very 
powerful framework for machine assisted software development.

7 Appendices

7.1 Appendix A 

In this Appendix A we will use the fact that, given any two atoms $A$ and $B$, and
any valuation $v$, if $\sigma(v(A)) \geq \sigma(v(B))$ then, for every substitution $\vartheta$,
$\sigma(v(A\vartheta)) \geq \sigma(v(B\vartheta))$. The same holds with $>$ instead of $\geq$.

Proof of Proposition 1. [Preservation of Local Stratification]. We will prove that,
for $k = 0, \ldots, n$, $P_k$ is locally stratified w.r.t. $\sigma$ by induction on $k$.

Base case ($k = 0$). By hypothesis $P_0$ is locally stratified w.r.t. $\sigma$.

Induction step. We assume that $P_k$ is locally stratified w.r.t. $\sigma$ and we show
that $P_{k+1}$ is locally stratified w.r.t. $\sigma$. We proceed by cases depending on the
transformation rule which is applied to derive $P_{k+1}$ from $P_k$.

Case 1. Program $P_{k+1}$ is derived by definition introduction (rule R1). We have
that $P_{k+1} = P_k \cup \{\delta_1, \ldots, \delta_m\}$, where $P_k$ is locally stratified w.r.t. $\sigma$ by the
inductive hypothesis and $\{\delta_1, \ldots, \delta_m\}$ is locally stratified w.r.t. $\sigma$ by Condition (iv)
of R1. Thus, $P_{k+1}$ is locally stratified w.r.t. $\sigma$.

Case 2. Program $P_{k+1}$ is derived by definition elimination (rule R2). Then $P_{k+1}$
is locally stratified w.r.t. $\sigma$ because $P_{k+1} \subseteq P_k$.

Case 3. Program $P_{k+1}$ is derived by positive unfolding (rule R3). We have that
$P_{k+1} = (P_k - \{\gamma\}) \cup \{\eta_1, \ldots, \eta_m\}$, where $\gamma$ is a clause in $P_k$ of the form
$H \leftarrow c \wedge G_L \wedge A \wedge G_R$ and clauses $\eta_1, \ldots, \eta_m$ are derived by unfolding $\gamma$ w.r.t. $A$.
Since, by the induction hypothesis, $(P_k - \{\gamma\})$ is locally stratified w.r.t. $\sigma$, it
remains to show that, for every valuation $v$, for $i = 1, \ldots, m$, clause $v(\eta_i)$ is
locally stratified w.r.t. $\sigma$. Take any valuation $v$. For $i = 1, \ldots, m$, there exists
a clause $\gamma_i$ in a variant of $P_k$ of the form $K_i \leftarrow c_i \wedge B_i$ such that $\eta_i$ is of
the form $H \leftarrow c \wedge A = K_i \wedge c_i \wedge G_L \wedge B_i \wedge G_R$. By the inductive hypothesis,
$v(H \leftarrow c \wedge G_L \wedge A \wedge G_R)$ and $v(K_i \leftarrow c_i \wedge B_i)$ are locally stratified w.r.t. $\sigma$. We
consider two cases: (a) $\mathcal{D} \models \neg v(c \wedge A = K_i \wedge c_i)$ and (b) $\mathcal{D} \models v(c \wedge A = K_i \wedge c_i)$.
In Case (a), $v(\eta_i)$ is locally stratified w.r.t. $\sigma$ by definition. In Case (b), we have
that: (i) $\mathcal{D} \models v(c)$, (ii) $\mathcal{D} \models v(A) = v(K_i)$, and (iii) $\mathcal{D} \models v(c_i)$. Let us consider a
literal $v(L)$ occurring in the body of $v(\eta_i)$. If $v(L)$ is an atom occurring positively
in $v(G_L \wedge G_R)$ then $\sigma(v(H)) \geq \sigma(v(L))$ because $v(H \leftarrow c \wedge G_L \wedge A \wedge G_R)$ is locally
stratified w.r.t. $\sigma$ and $\mathcal{D} \models v(c)$. Similarly, if $v(L)$ is a negated atom occurring
in $v(G_L \wedge G_R)$ then $\sigma(v(H)) > \sigma(v(L))$. If $v(L)$ is an atom occurring positively
in $v(B_i)$ then $\sigma(v(H)) \geq \sigma(v(L))$. Indeed:

$\sigma(v(H)) \geq \sigma(v(A))$    (because $v(H \leftarrow c \wedge G_L \wedge A \wedge G_R)$ is locally stratified
                                     w.r.t. $\sigma$ and $\mathcal{D} \models v(c)$)

$= \sigma(v(K_i))$    (because $v(A) = v(K_i)$)

$\geq \sigma(v(L))$    (because $v(K_i \leftarrow c_i \wedge B_i)$ is locally stratified w.r.t. $\sigma$
                        and $\mathcal{D} \models v(c_i)$)

Similarly, if $v(L)$ is a negated atom occurring in $v(B_i)$ then $\sigma(v(H)) > \sigma(v(L))$.
Thus, the clause $v(\eta_i)$ is locally stratified w.r.t. $\sigma$.

Case 4. Program $P_{k+1}$ is derived by negative unfolding (rule R4). As in Case 3,
we have that $P_{k+1} = (P_k - \{\gamma\}) \cup \{\eta_1, \ldots, \eta_s\}$, where $\gamma$ is a clause in $P_k$ of
the form $H \leftarrow c \wedge G_L \wedge \neg A \wedge G_R$ and clauses $\eta_1, \ldots, \eta_s$ are derived by negative
unfolding $\gamma$ w.r.t. $\neg A$. Since, by the induction hypothesis, $(P_k - \{\gamma\})$ is locally
stratified w.r.t. $\sigma$, it remains to show that, for every valuation $v$, for $j = 1, \ldots, s$,
clause $v(\eta_j)$ is locally stratified w.r.t. $\sigma$. Take any valuation $v$. Let
$K_1 \leftarrow c_1 \wedge B_1, \ldots, K_m \leftarrow c_m \wedge B_m$ be the clauses in a variant of $P_k$ such that, for
$i = 1, \ldots, m$, $\mathcal{D} \models \exists(c \wedge A = K_i \wedge c_i)$. Then, we have that, for $j = 1, \ldots, s$, the clause
$v(\eta_j)$ is of the form $v(H \leftarrow c \wedge e_j \wedge G_L \wedge Q_j \wedge G_R)$, where $v(Q_j)$ is a conjunction
of literals. By the applicability conditions of the negative unfolding rule and by
construction (see Steps 1-4 of R4), we have that there exist $m$ substitutions
$\vartheta_1, \ldots, \vartheta_m$ such that the following two properties hold:

(P.1) for every literal $v(L)$ occurring in $v(Q_j)$ there exists a (positive or negative)
literal $v(M)$ occurring in $v(B_i\vartheta_i)$ for some $i \in \{1, \ldots, m\}$, such that $v(L)$ is
$\overline{v(M)}$, and

(P.2) if $v(L)$ occurs in $v(Q_j)$ and $v(L)$ is $\overline{v(M)}$ with $v(M)$ occurring in $v(B_i\vartheta_i)$
for some $i \in \{1, \ldots, m\}$, then $\mathcal{D} \models v((c \wedge e_j) \rightarrow (A = K_i\vartheta_i \wedge c_i\vartheta_i))$.

We will show that $v(\eta_j)$ is locally stratified w.r.t. $\sigma$. By the inductive hypothesis,
we have that $v(H \leftarrow c \wedge G_L \wedge \neg A \wedge G_R)$ and $v(K_i\vartheta_i \leftarrow c_i\vartheta_i \wedge B_i\vartheta_i)$ are locally
stratified w.r.t. $\sigma$.

We consider two cases: (a) $\mathcal{D} \models \neg v(c \wedge e_j)$ and (b) $\mathcal{D} \models v(c \wedge e_j)$. In Case (a),
$v(\eta_j)$ is locally stratified w.r.t. $\sigma$ by definition. In Case (b), take any literal $v(L)$
occurring in $v(Q_j)$. By Properties (P.1) and (P.2), $v(L)$ is $\overline{v(M)}$ for some $v(M)$
occurring in $v(B_i\vartheta_i)$. We also have that: (i) $\mathcal{D} \models v(A) = v(K_i\vartheta_i)$ and (ii) $\mathcal{D} \models
v(c_i\vartheta_i)$. Moreover $\mathcal{D} \models v(c)$, because we are in Case (b). Now, if $v(M)$ is a
positive literal occurring in $v(B_i\vartheta_i)$ we have:

$\sigma(v(H)) > \sigma(v(A))$    (because $v(H \leftarrow c \wedge G_L \wedge \neg A \wedge G_R)$ is locally stratified
                                  w.r.t. $\sigma$ and $\mathcal{D} \models v(c)$)

$= \sigma(v(K_i\vartheta_i))$    (because $v(A) = v(K_i\vartheta_i)$)

(†) $\geq \sigma(v(M))$    (because $v(K_i\vartheta_i \leftarrow c_i\vartheta_i \wedge B_i\vartheta_i)$ is locally stratified
                            w.r.t. $\sigma$ and $\mathcal{D} \models v(c_i\vartheta_i)$).
Thus, we get: $\sigma(v(H)) > \sigma(v(M))$, and we conclude that $v(\eta_j)$ is locally stratified
w.r.t. $\sigma$. Similarly, if $v(M)$ is a negative literal occurring in $v(B_i\vartheta_i)$, we also get:
$\sigma(v(H)) > \sigma(v(M))$. (In particular, if $v(M)$ is a negative literal, at Point (†)
above, we have $\sigma(v(K_i\vartheta_i)) > \sigma(v(M))$.) Thus, we also conclude that $v(\eta_j)$ is
locally stratified w.r.t. $\sigma$.

Case 5. Program $P_{k+1}$ is derived by positive folding (rule R5). For reasons of
simplicity, we assume that we fold one clause only, that is, $m = 1$ in rule R5. The
general case where $m > 1$ is analogous. We have that $P_{k+1} = (P_k - \{\gamma\}) \cup \{\eta\}$,
where $\eta$ is a clause of the form $H \leftarrow c \wedge G_L \wedge K\vartheta \wedge G_R$ derived by positive folding
of clause $\gamma$ of the form $H \leftarrow c \wedge d\vartheta \wedge G_L \wedge B\vartheta \wedge G_R$ using a clause $\delta$ of the form
$K \leftarrow d \wedge B$ introduced by rule R1. We have to show that, for every valuation
$v$, $v(H \leftarrow c \wedge G_L \wedge K\vartheta \wedge G_R)$ is locally stratified w.r.t. $\sigma$. By the inductive
hypothesis, we have that: (i) for every valuation $v$, $v(\gamma)$ is locally stratified
w.r.t. $\sigma$, and (ii) for every valuation $v$, $v(\delta)$ is locally stratified w.r.t. $\sigma$. Take any
valuation $v$. There are two cases: (a) $\mathcal{D} \models \neg v(c)$ and (b) $\mathcal{D} \models v(c)$. In Case (a),
$v(\eta)$ is locally stratified w.r.t. $\sigma$ by definition. In Case (b), take any literal $v(L)$
occurring in $v(B\vartheta)$. Now, either (b1) $v(L)$ is a positive literal, or (b2) $v(L)$ is
a negative literal. In Case (b1) there are two subcases: (b1.1) $\mathcal{D} \models \neg v(d\vartheta)$, and
(b1.2) $\mathcal{D} \models v(d\vartheta)$. In Case (b1.1), by Condition (iv) of rule R1, $\sigma(v(K\vartheta)) = 0$
and thus, $\sigma(v(H)) \geq \sigma(v(K\vartheta))$. Hence, $v(\eta)$ is locally stratified w.r.t. $\sigma$. In
Case (b1.2), we have that $\mathcal{D} \models v(c \wedge d\vartheta)$ and, by the inductive hypothesis,
$\sigma(v(H)) \geq \sigma(v(L\vartheta))$. Thus, $\sigma(v(H)) \geq \sigma(v(K\vartheta))$, because by Condition (iv) of
rule R1, $\sigma(v(K\vartheta))$ is the smallest ordinal $\alpha$ such that $\alpha \geq \sigma(v(L\vartheta))$. Thus, $v(\eta)$
is locally stratified w.r.t. $\sigma$.

Case (b2), when $v(L)$ is a negative literal occurring in $v(B\vartheta)$, has a proof similar
to the one of Case (b1), except that $\sigma(v(H)) > \sigma(v(L\vartheta))$, instead of $\sigma(v(H)) \geq
\sigma(v(L\vartheta))$.

Case 6. Program $P_{k+1}$ is derived by negative folding (rule R6). We have that
$P_{k+1} = (P_k - \{\gamma\}) \cup \{\eta\}$, where $\eta$ is a clause of the form $H \leftarrow c \wedge d\vartheta \wedge G_L \wedge \neg K\vartheta \wedge
G_R$ derived by negative folding of clause $\gamma$ of the form $H \leftarrow c \wedge d\vartheta \wedge G_L \wedge \neg A\vartheta \wedge G_R$
using a clause $\delta$ of the form $K \leftarrow d \wedge A$ introduced by rule R1. We have to show
that, for every valuation $v$, $v(\eta)$ is locally stratified w.r.t. $\sigma$. By the inductive
hypothesis, we have that: (i) for every valuation $v$, $v(H \leftarrow c \wedge d\vartheta \wedge G_L \wedge \neg A\vartheta \wedge G_R)$
is locally stratified w.r.t. $\sigma$, and (ii) for every valuation $v$, $v(K \leftarrow d \wedge A)$ is
locally stratified w.r.t. $\sigma$. Take any valuation $v$. There are two cases: (a) $\mathcal{D} \models
\neg v(c \wedge d\vartheta)$, and (b) $\mathcal{D} \models v(c \wedge d\vartheta)$. In Case (a), $v(\eta)$ is locally stratified w.r.t. $\sigma$
by definition. In Case (b), by the inductive hypothesis, we have only to show
that $\sigma(v(H)) > \sigma(v(K\vartheta))$. Since $\mathcal{D} \models v(c \wedge d\vartheta)$, by the inductive hypothesis we
have that $\sigma(v(H)) > \sigma(v(A\vartheta))$. By Condition (iv) of the rule R1, we have that
$\sigma(v(H)) > \sigma(v(K\vartheta))$. Hence, $v(\eta)$ is locally stratified w.r.t. $\sigma$.

Case 7. Program $P_{k+1}$ is derived by replacement (rule R7). We have that $P_{k+1} =
(P_k - \Gamma_1) \cup \Gamma_2$, where $(P_k - \Gamma_1)$ is locally stratified w.r.t. $\sigma$ by the inductive
hypothesis and $\Gamma_2$ is locally stratified w.r.t. $\sigma$ by the applicability conditions of
rule R7. Thus, $P_{k+1}$ is locally stratified w.r.t. $\sigma$.

Case 8. Program $P_{k+1}$ is derived by deletion of useless clauses (rule R8). $P_{k+1}$
is locally stratified w.r.t. $\sigma$ by the inductive hypothesis because $P_{k+1} \subseteq P_k$.

Case 9. Program $P_{k+1}$ is derived by constraint addition (rule R9). We have that
$P_{k+1} = (P_k - \{\gamma_1\}) \cup \{\gamma_2\}$, where $\gamma_2$: $H \leftarrow c \wedge d \wedge G$ is the clause in $P_{k+1}$ derived
by constraint addition from the clause $\gamma_1$: $H \leftarrow c \wedge G$ in $P_k$. For every valuation
$v$, $v(H \leftarrow c \wedge d \wedge G)$ is locally stratified w.r.t. $\sigma$ because: (i) by the induction
hypothesis $v(H \leftarrow c \wedge G)$ is locally stratified w.r.t. $\sigma$ and (ii) if $\mathcal{D} \models v(c \wedge d)$ then
$\mathcal{D} \models v(c)$. Since, by the inductive hypothesis, $(P_k - \{\gamma_1\})$ is locally stratified
w.r.t. $\sigma$, also $P_{k+1}$ is locally stratified w.r.t. $\sigma$.

Case 10. Program $P_{k+1}$ is derived by constraint deletion (rule R10). We have
that $P_{k+1} = (P_k - \{\gamma_1\}) \cup \{\gamma_2\}$, where $\gamma_2$: $H \leftarrow c \wedge G$ is the clause in $P_{k+1}$
derived by constraint deletion from clause $\gamma_1$: $H \leftarrow c \wedge d \wedge G$ in $P_k$. By the
applicability conditions of R10, $\gamma_2$ is locally stratified w.r.t. $\sigma$. Since, by the
inductive hypothesis, $(P_k - \{\gamma_1\})$ is locally stratified w.r.t. $\sigma$, also $P_{k+1}$ is locally
stratified w.r.t. $\sigma$.

Finally, $P_0 \cup Defs_n$ is locally stratified w.r.t. $\sigma$ by the hypothesis that $P_0$ is
locally stratified w.r.t. $\sigma$ and by Condition (iv) of rule R1. □
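
As an illustration that is not part of the original development, the condition checked in each case of the proof above can be tested mechanically on ground clauses. The following Python sketch assumes a finite ground program, a stratification function given as a dictionary from ground atoms to natural numbers, and a Boolean flag that records whether the constraint of the clause holds under the chosen valuation. The names `Literal`, `GroundClause`, `clause_ok`, and `is_locally_stratified` are hypothetical.

```python
from typing import Mapping, NamedTuple, Sequence, Tuple


class Literal(NamedTuple):
    atom: str              # ground atom, e.g. "p"
    positive: bool = True  # False stands for a negated atom


class GroundClause(NamedTuple):
    head: str
    constraint_holds: bool  # assumed truth value of v(c) in the constraint domain
    body: Tuple[Literal, ...]


def clause_ok(clause: GroundClause, sigma: Mapping[str, int]) -> bool:
    """Check the local stratification condition for one ground clause."""
    if not clause.constraint_holds:
        return True  # the clause is locally stratified by definition
    for lit in clause.body:
        if lit.positive and sigma[clause.head] < sigma[lit.atom]:
            return False  # positive literals need sigma(head) >= sigma(atom)
        if not lit.positive and sigma[clause.head] <= sigma[lit.atom]:
            return False  # negated atoms need sigma(head) > sigma(atom)
    return True


def is_locally_stratified(program: Sequence[GroundClause],
                          sigma: Mapping[str, int]) -> bool:
    return all(clause_ok(clause, sigma) for clause in program)


# Toy instance of Case 3: unfolding  p <- q, not r  w.r.t. q using  q <- s.
sigma = {"p": 1, "q": 1, "s": 0, "r": 0}
gamma = GroundClause("p", True, (Literal("q"), Literal("r", False)))
beta = GroundClause("q", True, (Literal("s"),))
eta = GroundClause("p", True, (Literal("s"), Literal("r", False)))

p_k = [gamma, beta]
p_k_plus_1 = [eta, beta]
bad = GroundClause("p", True, (Literal("p", False),))  # p <- not p
```

In this toy instance both `p_k` and `p_k_plus_1` pass the check with the same `sigma`, as Case 3 predicts, while the clause `bad` fails it unless its constraint is false.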

7.2 Appendix B 

In the proofs of Appendices B and C we use the following notions. Given a clause
$\gamma$: $H \leftarrow c \wedge L_1 \wedge \ldots \wedge L_m$ and a valuation $v$ such that $\mathcal{D} \models v(c)$, we denote by $\gamma_v$
the clause $v(H \leftarrow L_1 \wedge \ldots \wedge L_m)$. We define $ground(\gamma) = \{\gamma_v \mid v$ is a valuation
and $\mathcal{D} \models v(c)\}$. Given a set $\Gamma$ of clauses, we define $ground(\Gamma) = \bigcup_{\gamma \in \Gamma} ground(\gamma)$.
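
The definition of $ground(\gamma)$ can be made concrete when the constraint domain is finite, which is a simplifying assumption made here only for illustration (the domain of the paper need not be finite). In the Python sketch below a clause is represented by its variables, a constraint predicate, and functions that build the head and the body literals from a valuation. The function name `ground` and this representation are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Valuation = Dict[str, int]


def ground(variables: Sequence[str],
           constraint: Callable[[Valuation], bool],
           head: Callable[[Valuation], str],
           body: Sequence[Callable[[Valuation], str]],
           domain: Iterable[int]) -> List[Tuple[str, Tuple[str, ...]]]:
    """Enumerate the ground clauses gamma_v for all valuations v with D |= v(c)."""
    values = list(domain)
    instances = []
    for choice in product(values, repeat=len(variables)):
        v = dict(zip(variables, choice))
        if constraint(v):  # keep only valuations that satisfy the constraint c
            instances.append((head(v), tuple(lit(v) for lit in body)))
    return instances


# Clause  p(X) <- X > 1, q(X), not r(X)  over the finite domain {0, 1, 2, 3}.
instances = ground(
    ["X"],
    lambda v: v["X"] > 1,
    lambda v: f"p({v['X']})",
    [lambda v: f"q({v['X']})", lambda v: f"not r({v['X']})"],
    range(4),
)
```

Note that the constraint is dropped from each ground instance, as in the definition of $\gamma_v$; only the valuations for $X = 2$ and $X = 3$ contribute a clause.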

Proof of Proposition 3. Recall that $P_0, \ldots, P_i$ is constructed by $i$ ($\geq 0$)
applications of the definition rule, that is, $P_i = P_0 \cup Defs_i$, and $P_i, \ldots, P_j$ is constructed
by applying once the positive unfolding rule to each clause in $Defs_i$. Let $\sigma$ be
the fixed stratification function considered at the beginning of the construction
of the transformation sequence. By Proposition 1, each program in the sequence
$P_i, \ldots, P_j$ is locally stratified w.r.t. $\sigma$.

Let us consider a ground atom $A$. By complete induction on the ordinal $\sigma(A)$
we prove that, for $k = i, \ldots, j-1$, there exists a proof tree for $A$ and $P_k$ iff there
exists a proof tree for $A$ and $P_{k+1}$. The inductive hypothesis is:

(I1) for every ground atom $A'$, if $\sigma(A') < \sigma(A)$ then there exists a proof tree for
$A'$ and $P_k$ iff there exists a proof tree for $A'$ and $P_{k+1}$.

(If Part) We consider a proof tree $U$ for $A$ and $P_{k+1}$, and we show that we can
construct a proof tree $T$ for $A$ and $P_k$. We proceed by complete induction on
$size(U)$. The inductive hypothesis is:

(I2) given any proof tree $U_1$ for a ground atom $A_1$ and $P_{k+1}$, if $size(U_1) < size(U)$
then there exists a proof tree $T_1$ for $A_1$ and $P_k$.

Let $\gamma$ be a clause of $P_{k+1}$ and let $\gamma_v$: $A \leftarrow L_1 \wedge \ldots \wedge L_r$ be the clause in
$ground(\gamma)$ used at the root of $U$. Thus, $L_1, \ldots, L_r$ are the children of $A$ in $U$. For
$h = 1, \ldots, r$, if $L_h$ is an atom then the subtree $U_h$ of $U$ rooted at $L_h$ is a proof
tree for $L_h$ and $P_{k+1}$. Since $size(U_h) < size(U)$, by the inductive hypothesis (I2)
there exists a proof tree $T_h$ for $L_h$ and $P_k$. For $h = 1, \ldots, r$, if $L_h$ is a negated
atom $\neg A_h$ then, by the definition of proof tree, there exists no proof tree for $A_h$
and $P_{k+1}$. Since $\sigma$ is a local stratification for $P_{k+1}$, we have that $\sigma(A_h) < \sigma(A)$
and, by the inductive hypothesis (I1), there exists no proof tree for $A_h$ and $P_k$.
Now, we proceed by cases.

Case 1. $\gamma \in P_k$. We construct $T$ as follows. The root of $T$ is $A$. We use $\gamma_v$:
$A \leftarrow L_1 \wedge \ldots \wedge L_r$ to construct the children of $A$. If $r = 0$ then true is the only
child of $A$ in $T$, and $T$ is a proof tree for $A$ and $P_k$. Otherwise $r \geq 1$ and, for
$h = 1, \ldots, r$, if $L_h$ is an atom $A_h$ then $T_h$ is the subtree of $T$ at $A_h$, and if $L_h$
is a negated atom then $L_h$ is a leaf of $T$. By construction we have that $T$ is a
proof tree for $A$ and $P_k$.

Case 2. $\gamma \notin P_k$ and $\gamma \in P_{k+1}$ because $\gamma$ is derived by positive unfolding. Thus,
there exist: a clause $\alpha$ in $P_k$ of the form $H \leftarrow c \wedge G_L \wedge A_S \wedge G_R$ and a variant
$\beta$ of a clause in $P_k$ of the form $K \leftarrow d \wedge B$ such that clause $\gamma$ is of the form
$H \leftarrow c \wedge A_S = K \wedge d \wedge G_L \wedge B \wedge G_R$. Thus, (i) $v(H) = A$, (ii) $\mathcal{D} \models v(c \wedge A_S = K \wedge d)$,
and (iii) $v(G_L \wedge B \wedge G_R) = L_1 \wedge \ldots \wedge L_r$. By (ii) we have that $\alpha_v \in ground(P_k)$
and $\beta_v \in ground(P_k)$. (Notice that, since $\beta$ is a variant of a clause in $P_k$, then
$\beta_v \in ground(P_k)$.)

We construct $T$ as follows. The root of $T$ is $A$. We use $\alpha_v$ to construct the
children of $A$ and then we use $\beta_v$ to construct the children of $v(A_S)$. The leaves of
the tree constructed in this way are $L_1, \ldots, L_r$. If $r = 0$ then true is the only leaf
of $T$, and $T$ is a proof tree for $A$ and $P_k$. Otherwise $r \geq 1$ and, for $h = 1, \ldots, r$,
if $L_h$ is an atom then $T_h$ is the subtree of $T$ rooted at $L_h$, and if $L_h$ is a negated
atom then $L_h$ is a leaf of $T$. By construction we have that $T$ is a proof tree for
$A$ and $P_k$.

(Only-if Part) We consider a proof tree $T$ for a ground atom $A$ and program $P_k$,
for $k = i, \ldots, j-1$, and we show that we can construct a proof tree $U$ for $A$ and
$P_{k+1}$. We proceed by complete induction on $size(T)$. The inductive hypothesis
is:

(I3) given any proof tree $T_1$ for a ground atom $A_1$ and $P_k$, if $size(T_1) < size(T)$
then there exists a proof tree $U_1$ for $A_1$ and $P_{k+1}$.

Let $\gamma$ be a clause of $P_k$ and let $\gamma_v$: $A \leftarrow L_1 \wedge \ldots \wedge L_r$ be the clause in
$ground(\gamma)$ used at the root of $T$. Now we proceed by cases.

Case 1. $\gamma \in P_{k+1}$. We construct the proof tree $U$ for $A$ and $P_{k+1}$ as follows. We
use $\gamma_v$ to construct the children $L_1, \ldots, L_r$ of the root $A$. If $r = 0$ then true is
the only child of $A$ in $U$, and $U$ is a proof tree for $A$ and $P_{k+1}$. Otherwise, $r \geq 1$
and, for $h = 1, \ldots, r$, if $L_h$ is an atom, we consider the subtree $T_h$ of $T$ rooted
at $L_h$. We have that $T_h$ is a proof tree for $L_h$ and $P_k$ with $size(T_h) < size(T)$
and, therefore, by the inductive hypothesis (I3), there exists a proof tree $U_h$ for
$L_h$ and $P_{k+1}$. For $h = 1, \ldots, r$, if $L_h$ is a negated atom $\neg A_h$, then $\sigma(A) > \sigma(A_h)$
because $\sigma$ is a stratification function for $P_k$. Thus, by the inductive hypothesis
(I1) we have that there is no proof tree for $A_h$ and $P_{k+1}$. The construction of
$U$ continues as follows. For $h = 1, \ldots, r$, if $L_h$ is an atom then we use $U_h$ as a
subtree of $U$ rooted at $L_h$ and, if $L_h$ is a negated atom, then $L_h$ is a leaf of $U$.
Thus, by construction we have that $U$ is a proof tree for $A$ and $P_{k+1}$.

Case 2. $\gamma \in P_k$ and $\gamma \notin P_{k+1}$ because $\gamma$ has been unfolded w.r.t. an atom in its
body. Let us assume that $\gamma$ is of the form $H \leftarrow c \wedge G_L \wedge A_S \wedge G_R$ and $\gamma$ has
been unfolded w.r.t. $A_S$. We have that: (i) $v(H) = A$, (ii) $\mathcal{D} \models v(c)$, and (iii) the
ground literals $L_1, \ldots, L_r$ such that $L_1 \wedge \ldots \wedge L_r = v(G_L \wedge A_S \wedge G_R)$ are the
children of $A$ in $T$. Let $\beta$: $K \leftarrow d \wedge B$ be the clause in $P_k$ which has been used for
constructing the children of $v(A_S)$ in $T$. Thus, there exists a valuation $v'$ such
that: (iv) $v(A_S) = v'(K)$, (v) $\mathcal{D} \models v'(d)$, and (vi) the literals in $v'(B)$ are the
children of $v(A_S)$ in $T$. Without loss of generality we may assume that $\gamma$ and $\beta$
have no variables in common and $v = v'$. Thus, the ground literals $M_1, \ldots, M_s$
such that $M_1 \wedge \ldots \wedge M_s = v(G_L \wedge B \wedge G_R)$ are descendants of $A$ in $T$. For
$h = 1, \ldots, s$, if $M_h$ is an atom, let us consider the subtree $T_h$ of $T$ rooted at
$M_h$. We have that $T_h$ is a proof tree for $M_h$ and $P_k$ with $size(T_h) < size(T)$ and,
therefore, by the inductive hypothesis (I3), there exists a proof tree $U_h$ for $M_h$
and $P_{k+1}$. For $h = 1, \ldots, s$, if $M_h$ is a negated atom $\neg A_h$ then $M_h$ is a leaf of $T$
and there exists no proof tree for $A_h$ and $P_k$. Since $\sigma$ is a stratification function
for $P_k$, we have that $\sigma(A) > \sigma(A_h)$ and thus, by the inductive hypothesis (I1),
there exists no proof tree for $A_h$ and $P_{k+1}$.

Now let us consider the clause $\eta$: $H \leftarrow c \wedge A_S = K \wedge d \wedge G_L \wedge B \wedge G_R$. $\eta$ is
one of the clauses derived by unfolding $\gamma$ because $\beta \in P_k$ and, by (ii), (iv), (v)
and the assumption that $v = v'$, we have that $\mathcal{D} \models v(c \wedge A_S = K \wedge d)$ and hence
$\mathcal{D} \models \exists(c \wedge A_S = K \wedge d)$. Thus, we construct a proof tree $U$ for $A$ and $P_{k+1}$ as
follows. Since $A = v(H)$ and $M_1 \wedge \ldots \wedge M_s = v(G_L \wedge B \wedge G_R)$, we can use
$v(H \leftarrow G_L \wedge B \wedge G_R)$ to construct the children $M_1, \ldots, M_s$ of $A$ in $U$. If $s = 0$
then true is the only child of $A$ in $U$, and $U$ is a proof tree for $A$ and $P_{k+1}$.
Otherwise, $s \geq 1$ and, for $h = 1, \ldots, s$, if $M_h$ is an atom then $U_h$ is the proof tree
rooted at $M_h$ in $U$. If $M_h$ is a negated atom then $M_h$ is a leaf of $U$. The proof
tree $U$ is the proof tree for $A$ and $P_{k+1}$ to be constructed. □
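
The statement just proved can be observed on a small ground instance. The Python sketch below is an illustration under simplifying assumptions and is not the construction used in the proof: programs are finite and ground, constraints have already been evaluated away, and the stratification function is a dictionary into the natural numbers. It computes the set of atoms that have a proof tree by processing strata in increasing order, so that negated atoms always refer to strata that are already settled. The name `provable_atoms` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A ground clause: (head, [(atom, is_positive), ...]).
Clause = Tuple[str, List[Tuple[str, bool]]]


def provable_atoms(program: List[Clause], sigma: Dict[str, int]) -> Set[str]:
    """Atoms with a proof tree, assuming `program` is locally stratified w.r.t. sigma."""
    proved: Set[str] = set()
    for stratum in sorted(set(sigma.values())):
        changed = True
        while changed:  # least fixpoint within the current stratum
            changed = False
            for head, body in program:
                if sigma[head] != stratum or head in proved:
                    continue
                # A positive atom needs a proof tree; a negated atom must have none.
                if all((atom in proved) == positive for atom, positive in body):
                    proved.add(head)
                    changed = True
    return proved


sigma = {"s": 0, "r": 0, "q": 0, "p": 1}

# P_k:  p <- q, not r.   q <- s.   s.
p_k = [("p", [("q", True), ("r", False)]), ("q", [("s", True)]), ("s", [])]

# P_{k+1}: the clause for p is unfolded w.r.t. q using  q <- s.
p_k_plus_1 = [("p", [("s", True), ("r", False)]), ("q", [("s", True)]), ("s", [])]

before = provable_atoms(p_k, sigma)
after = provable_atoms(p_k_plus_1, sigma)
```

On this instance `before` and `after` coincide, which is the ground-level content of the equivalence between $P_k$ and $P_{k+1}$ shown above.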

7.3 Appendix C 

Proof of Proposition 5. Recall that the transformation sequence $P_0, \ldots, P_i,
\ldots, P_j, \ldots, P_m$ is constructed as follows (see Definition 3):

(1) the sequence $P_0, \ldots, P_i$, with $i \geq 0$, is constructed by applying $i$ times the
definition introduction rule, that is, $P_i = P_0 \cup Defs_i$,

(2) the sequence $P_i, \ldots, P_j$ is constructed by applying once the positive unfolding
rule to each clause in $Defs_i$ which is used for applications of the folding rule in
$P_j, \ldots, P_m$,

(3) the sequence $P_j, \ldots, P_m$, with $j \leq m$, is constructed by applying any rule,
except the definition introduction and definition elimination rules.

Let $\sigma$ be the fixed stratification function considered at the beginning of the
construction of the transformation sequence. By Proposition 1, each program in
the sequence $P_0 \cup Defs_i, \ldots, P_j, \ldots, P_m$ is locally stratified w.r.t. $\sigma$.

We will prove by induction on $k$ that, for $k = j, \ldots, m$,

(Soundness) if there exists a proof tree for a ground atom $A$ and $P_k$ then there
exists a proof tree for $A$ and $P_j$, and

(Completeness) if there exists a $P_j$-consistent proof tree for a ground atom $A$
and $P_j$ then there exists a $P_j$-consistent proof tree for $A$ and $P_k$.

The base case ($k = j$) is trivial.

For proving the induction step, consider any $k$ in $\{j, \ldots, m-1\}$. We assume that
the soundness and completeness properties hold for that $k$, and we prove that
they hold for $k+1$. For the soundness property it is enough to prove that:




- if there exists a proof tree for a ground atom $A$ and $P_{k+1}$ then there exists a
proof tree for $A$ and $P_k$,

and for the completeness property it is enough to prove that:

- if there exists a $P_j$-consistent proof tree for a ground atom $A$ and $P_k$ then there
exists a $P_j$-consistent proof tree for $A$ and $P_{k+1}$.

We proceed by complete induction on the ordinal $\sigma(A)$ associated with the
ground atom $A$. The inductive hypotheses are:

(IS) for every ground atom $A'$ such that $\sigma(A') < \sigma(A)$, if there exists a proof tree
for $A'$ and $P_{k+1}$ then there exists a proof tree for $A'$ and $P_k$, and

(IC) for every ground atom $A'$ such that $\sigma(A') < \sigma(A)$, if there exists a
$P_j$-consistent proof tree for $A'$ and $P_k$ then there exists a $P_j$-consistent proof
tree for $A'$ and $P_{k+1}$.

By the inductive hypotheses on soundness and completeness for $k$, (IS), (IC),
and Proposition 4, we have that:

(ISC) for every ground atom $A'$ such that $\sigma(A') < \sigma(A)$, there exists a proof tree
for $A'$ and $P_k$ iff there exists a proof tree for $A'$ and $P_{k+1}$.

Now we give the proofs for the soundness and the completeness properties. 

Proof of Soundness. Given a proof tree $U$ for $A$ and $P_{k+1}$ we have to prove that
there exists a proof tree $T$ for $A$ and $P_k$. The proof is by complete induction on
$size(U)$. The inductive hypothesis is:

(Isize) Given any proof tree $U'$ for a ground atom $A'$ and $P_{k+1}$, if $size(U') <
size(U)$ then there exists a proof tree $T'$ for $A'$ and $P_k$.

Let $\gamma$ be a clause in $P_{k+1}$ and $v$ be a valuation. Let $\gamma_v \in ground(\gamma)$ be the
ground clause of the form $A \leftarrow L_1 \wedge \ldots \wedge L_r$ used at the root of $U$. We proceed
by considering the following cases: either (Case 1) $\gamma$ belongs to $P_k$ or (Case
2) $\gamma$ does not belong to $P_k$ and it has been derived from some clauses in $P_k$
by applying a transformation rule among R3, R4, R5, R6, R7, R9, R10. (Recall
that R1 and R2 are not applied in $P_j, \ldots, P_m$, and by R8 we delete clauses.)

The proof of Case 1 and the proofs of Case 2 for rules R3, R4, R9, and R10
are left to the reader. Now we present the proofs of Case 2 for rules R5, R6, and
R7.

Case 2, rule R5. Clause $\gamma$ is derived by positive folding. Let $\gamma$ be derived by
folding clauses $\gamma_1, \ldots, \gamma_m$ in $P_k$ using clauses $\delta_1, \ldots, \delta_m$ where, for $i = 1, \ldots, m$,
clause $\delta_i$ is of the form $K \leftarrow d_i \wedge B_i$ and clause $\gamma_i$ is of the form $H \leftarrow c \wedge
d_i\vartheta \wedge G_L \wedge B_i\vartheta \wedge G_R$, for a substitution $\vartheta$ satisfying Conditions (i) and (ii) given
in (R5). Thus, $\gamma$ is of the form: $H \leftarrow c \wedge G_L \wedge K\vartheta \wedge G_R$ and we have that:
(a) $v(H) = A$, (b) $\mathcal{D} \models v(c)$, and (c) $v(G_L \wedge K\vartheta \wedge G_R) = L_1 \wedge \ldots \wedge L_r$. Since
program $P_{k+1}$ is locally stratified w.r.t. $\sigma$, by the inductive hypotheses (ISC) and
(Isize) we have that: for $h = 1, \ldots, r$, if $L_h$ is an atom then there exists a proof
tree $T_h$ for $L_h$ and $P_k$, and if $L_h$ is a negated atom $\neg A_h$ then there is no proof
tree for $A_h$ and $P_k$. The atom $v(K\vartheta)$ is one of the literals $L_1, \ldots, L_r$, say $L_f$, and
thus, there exists a proof tree for $v(K\vartheta)$ and $P_k$. By the inductive hypothesis
(Soundness) for $P_k$ and Proposition 3, there exists a proof tree for $v(K\vartheta)$ and $P_i$.
Since $P_i = P_0 \cup Defs_i$ and $\delta_1, \ldots, \delta_m$ are all clauses in (a variant of) $P_0 \cup Defs_i$
which have the same predicate symbol as $K$, there exists $\delta_p \in \{\delta_1, \ldots, \delta_m\}$ such
that $\delta_p$ is of the form $K \leftarrow d_p \wedge B_p$ and $\delta_p$ is used to construct the children of
$v(K\vartheta)$ in the proof tree for $v(K\vartheta)$ and $P_i$. By Conditions (i) and (ii) on $\vartheta$ given
in (R5), we have that: (d) $\mathcal{D} \models v(d_p\vartheta)$ and (e) $v(B_p\vartheta) = M_1 \wedge \ldots \wedge M_s$. By
the definition of proof tree, for $h = 1, \ldots, s$, if $M_h$ is an atom then there exists
a proof tree for $M_h$ and $P_i$, else if $M_h$ is a negated atom $\neg E_h$ then there is no
proof tree for $E_h$ and $P_i$. By Propositions 3 and 4 and the inductive hypotheses
(Soundness and Completeness) we have that, for $h = 1, \ldots, s$, if $M_h$ is an atom
then there exists a proof tree $T_h$ for $M_h$ and $P_k$, else if $M_h$ is a negated atom
$\neg E_h$ then there is no proof tree for $E_h$ and $P_k$.

Now we construct the proof tree $T$ for $A$ and $P_k$ as follows. By (a), (b),
and (d), we have that $v(H) = A$ and $\mathcal{D} \models v(c \wedge d_p\vartheta)$. Thus, we construct the
children of $A$ in $T$ by using the clause $H \leftarrow c \wedge d_p\vartheta \wedge G_L \wedge B_p\vartheta \wedge G_R$. Since
$v(G_L \wedge B_p\vartheta \wedge G_R) = L_1 \wedge \ldots \wedge L_{f-1} \wedge M_1 \wedge \ldots \wedge M_s \wedge L_{f+1} \wedge \ldots \wedge L_r$, the children of $A$
in $T$ are: $L_1, \ldots, L_{f-1}, M_1, \ldots, M_s, L_{f+1}, \ldots, L_r$. By the applicability conditions
of the positive folding rule, we have that $s > 0$ and $A$ has a child different from
the empty conjunction true. The children of $A$ are constructed as follows. For
$h = 1, \ldots, r$, if $L_h$ is an atom then $T_h$ is the subtree of $T$ rooted in $L_h$, else if
$L_h$ is a negated atom then $L_h$ is a leaf of $T$. For $h = 1, \ldots, s$, if $M_h$ is an atom
then $T_h$ is the subtree of $T$ rooted in $M_h$, else if $M_h$ is a negated atom then $M_h$
is a leaf of $T$.

Case 2, rule R6. Clause $\gamma$ is derived by negative folding. Let $\gamma$ be derived by
folding a clause $\alpha$ in $P_k$ of the form $H \leftarrow c \wedge G_L \wedge \neg A_R\vartheta \wedge G_R$ by using a clause
$\delta \in Defs_i$ of the form $K \leftarrow d \wedge A_R$. Thus, $\gamma$ is of the form $H \leftarrow c \wedge G_L \wedge \neg K\vartheta \wedge G_R$.

Let $\gamma_v$ be of the form $A \leftarrow L_1 \wedge \ldots \wedge L_{f-1} \wedge \neg v(K\vartheta) \wedge L_{f+1} \wedge \ldots \wedge L_r$,
that is, $v(H) = A$ and $\mathcal{D} \models v(c)$. By the conditions on the applicability of
rule R6, we also have that $\mathcal{D} \models v(d\vartheta)$. Since program $P_{k+1}$ is locally stratified
w.r.t. $\sigma$, we have that $\sigma(v(K\vartheta)) < \sigma(A)$. By the definition of proof tree, there
is no proof tree for $v(K\vartheta)$ and $P_{k+1}$. Thus, by hypothesis (ISC) there exists no
proof tree for $v(K\vartheta)$ and $P_k$. By the inductive hypothesis (Completeness) and
Propositions 3 and 4, there exists no proof tree for $v(K\vartheta)$ and $P_0 \cup Defs_i$ and
thus, since $K \leftarrow d \wedge A_R$ is the only clause defining the head predicate of $K$ and
$\mathcal{D} \models v(d\vartheta)$, there is no proof tree for $v(A_R\vartheta)$ and $P_0 \cup Defs_i$. By Proposition 3
and the inductive hypothesis (Soundness), there exists no proof tree for $v(A_R\vartheta)$
and $P_k$. Since $\mathcal{D} \models v(c)$ there exists a clause $\alpha_v$ in $ground(\alpha)$ of the form
$A \leftarrow L_1 \wedge \ldots \wedge L_{f-1} \wedge \neg v(A_R\vartheta) \wedge L_{f+1} \wedge \ldots \wedge L_r$. We begin the construction of
$T$ by using $\alpha_v$ at the root. For all $h = 1, \ldots, f-1, f+1, \ldots, r$ such that $L_h$ is an
atom and $U_h$ is the subtree of $U$ rooted in $L_h$, we have that $size(U_h) < size(U)$.
By hypothesis (Isize) there exists a proof tree $T_h$ for $L_h$ and $P_k$ which we use as
a subtree of $T$ rooted in $L_h$. For all $h = 1, \ldots, f-1, f+1, \ldots, r$ such that $L_h$ is a
negated atom $\neg A_h$ we have that $\sigma(A_h) < \sigma(A)$, because program $P_{k+1}$ is locally
stratified w.r.t. $\sigma$. Moreover, there is no proof tree for $A_h$ in $P_{k+1}$, because $U$ is
a proof tree. By hypothesis (ISC) we have that there is no proof tree for $A_h$ in
$P_k$. Thus, for all $h = 1, \ldots, f-1, f+1, \ldots, r$ such that $L_h$ is a negated atom
we take $L_h$ to be a leaf of $T$.

Case 2, rule R7. Clause $\gamma$ is derived by replacement. We only consider the case
where $P_{k+1}$ is derived from program $P_k$ by applying the replacement rule based
on law (8). The other cases are left to the reader. Suppose that a clause $\eta$:
$H \leftarrow c_1 \wedge G$ in $P_k$ is replaced by clause $\gamma$: $H \leftarrow c_2 \wedge G$ and $\mathcal{D} \models \forall(\exists Y c_1 \leftrightarrow \exists Z c_2)$,
where: (i) $Y = FV(c_1) - FV(\{H, G\})$ and (ii) $Z = FV(c_2) - FV(\{H, G\})$. Thus,
$ground(\gamma) = ground(\eta)$ and we can construct a proof tree for the ground atom
$A$ and $P_k$ by using a clause in $ground(\eta)$, instead of a clause in $ground(\gamma)$.

Proof of Completeness. Given a $P_j$-consistent proof tree $T$ for $A$ and $P_k$, we prove
that there exists a $P_j$-consistent proof tree for $A$ and $P_{k+1}$. The proof is by
well-founded induction on $\mu(A, P_j)$. The inductive hypothesis is:

(I$\mu$) for every ground atom $A'$ such that $\mu(A', P_j) < \mu(A, P_j)$, if there exists a
$P_j$-consistent proof tree $T'$ for $A'$ and $P_k$ then there exists a $P_j$-consistent proof
tree $U'$ for $A'$ and $P_{k+1}$.

Let $\gamma$ be a clause in $P_k$ and $v$ be a valuation such that $\gamma_v \in ground(\gamma)$ is the
ground clause of the form $A \leftarrow L_1 \wedge \ldots \wedge L_r$ used at the root of $T$.

The proof proceeds by considering the following cases: either $\gamma$ belongs to
$P_{k+1}$ or $\gamma$ does not belong to $P_{k+1}$ because it has been replaced (together with
other clauses in $P_k$) with new clauses derived by an application of a
transformation rule among R3, R4, R5, R6, R7, R8, R9, R10 (recall that R1 and R2 are
not applied in $P_j, \ldots, P_m$). We present only the case where $P_{k+1}$ is derived from
$P_k$ by positive folding (rule R5). The other cases are similar and are left to the
reader.

Suppose that $P_{k+1}$ is derived from $P_k$ by folding clauses $\gamma_1, \ldots, \gamma_m$ in $P_k$
using clauses $\delta_1, \ldots, \delta_m$ in (a variant of) $Defs_k$, and let $\gamma$ be $\gamma_p$, with $1 \leq p \leq m$.
Suppose also that, for $i = 1, \ldots, m$, clause $\delta_i$ is of the form $K \leftarrow d_i \wedge B_i$ and
clause $\gamma_i$ is of the form $H \leftarrow c \wedge d_i\vartheta \wedge G_L \wedge B_i\vartheta \wedge G_R$, for a substitution $\vartheta$
satisfying Conditions (i) and (ii) given in (R5). The clause $\eta$ derived by folding
$\gamma_1, \ldots, \gamma_m$ using $\delta_1, \ldots, \delta_m$ is of the form: $H \leftarrow c \wedge G_L \wedge K\vartheta \wedge G_R$. Since we
use $\gamma_v$ at the root of $T$, we have that: (a) $v(H) = A$, (b) $\mathcal{D} \models v(c \wedge d_p\vartheta)$,
and (c) $v(G_L \wedge B_p\vartheta \wedge G_R) = L_1 \wedge \ldots \wedge L_r$, that is, for some $f1$, $f2$, $v(G_L) =
L_1 \wedge \ldots \wedge L_{f1}$, $v(B_p\vartheta) = L_{f1+1} \wedge \ldots \wedge L_{f2}$, and $v(G_R) = L_{f2+1} \wedge \ldots \wedge L_r$. By
Proposition 4 and the inductive hypotheses (Soundness and Completeness), for
$h = f1+1, \ldots, f2$, if $L_h$ is an atom then there exists a proof tree for $L_h$ and $P_j$,
and if $L_h$ is a negated atom $\neg A_h$ then there is no proof tree for $A_h$ and $P_j$. By
Proposition 3, by the fact that (by (b)) $\mathcal{D} \models v(d_p\vartheta)$, and by the fact that $\delta_p \in P_i$
(recall that $Defs_k \subseteq P_i$), we have that there exists a proof tree for $v(K\vartheta)$ and
$P_j$. Moreover, since $K \leftarrow d_p \wedge B_p$ has been unfolded w.r.t. a positive literal, we
have that:

(†) $\mu(v(B_p\vartheta), P_j) > \mu(v(K\vartheta), P_j)$

By Proposition 4 and the inductive hypothesis (Completeness), there exists a
proof tree for $v(K\vartheta)$ and $P_k$. Since $T$ is $P_j$-consistent we have that, for $h =
1,\ldots,r$, $\mu(A,P_j) > \mu(L_h,P_j)$. Moreover, we have that:

\[
\begin{aligned}
\mu(A,P_j) &> \mu(v(G_L \wedge B_p\vartheta \wedge G_R), P_j) && \text{(because $T$ is $P_j$-consistent)} \\
 &= \mu(v(G_L),P_j) \oplus \mu(v(B_p\vartheta),P_j) \oplus \mu(v(G_R),P_j) && \text{(by definition of $\mu$)} \\
 &> \mu(v(G_L),P_j) \oplus \mu(v(K\vartheta),P_j) \oplus \mu(v(G_R),P_j) && \text{(by ($\dagger$))} \\
 &> \mu(v(K\vartheta),P_j) && \text{(by definition of $\mu$)}
\end{aligned}
\]

By the inductive hypotheses ($I\mu$) and (IS), for $h = 1,\ldots,f1,f2+1,\ldots,r$, if $L_h$
is an atom then there exists a $P_j$-consistent proof tree $U_h$ for $L_h$ and $P_{k+1}$, and
if $L_h$ is a negated atom $\neg A_h$ then there is no proof tree for $A_h$ and $P_{k+1}$.
Moreover, by the inductive hypothesis ($I\mu$), there exists a $P_j$-consistent proof
tree for $v(K\vartheta)$ and $P_{k+1}$.

Now we construct a $P_j$-consistent proof tree $U$ for $A$ and $P_{k+1}$ as follows.
By (a) and (b) we have that $v(H) = A$ and $\mathcal{D} \models v(c)$. Thus, we construct
the children of $A$ in $U$ by using the clause $\eta$: $H \leftarrow c \wedge G_L \wedge K\vartheta \wedge G_R$. Since
$v(G_L \wedge K\vartheta \wedge G_R) = L_1 \wedge \ldots \wedge L_{f1} \wedge v(K\vartheta) \wedge L_{f2+1} \wedge \ldots \wedge L_r$, the children of
$A$ in $U$ are: $L_1,\ldots,L_{f1}, v(K\vartheta), L_{f2+1},\ldots,L_r$. The construction of $U$ continues
as follows. For $h = 1,\ldots,f1,f2+1,\ldots,r$, if $L_h$ is an atom then $U_h$ is the
$P_j$-consistent subtree of $U$ rooted in $L_h$, else if $L_h$ is a negated atom then $L_h$ is
a leaf of $U$. Finally, the subtree of $U$ rooted in $v(K\vartheta)$ is the $P_j$-consistent proof
tree for $v(K\vartheta)$ and $P_{k+1}$ given by the inductive hypothesis ($I\mu$).

The proof tree $U$ is indeed $P_j$-consistent because: (i) for $h = 1,\ldots,f1,
f2+1,\ldots,r$, $\mu(A,P_j) > \mu(L_h,P_j)$, (ii) $\mu(A,P_j) > \mu(v(K\vartheta),P_j)$, and (iii) every
subtree rooted in one of the literals $L_1,\ldots,L_{f1}, v(K\vartheta), L_{f2+1},\ldots,L_r$ is
$P_j$-consistent. □
Abstract. We present the latest version of the logen partial evaluation
system for logic programs. In particular we present new binding-types, 
and show how they can be used to effectively specialise a wide variety of 
interpreters. We show how to achieve Jones-optimality in a systematic 
way for several interpreters. Finally, we present and specialise a non- 
trivial interpreter for a small functional programming language. Exper- 
imental results are also presented, highlighting that the LOGEN system 
can be a good basis for generating compilers for high-level languages. 



1 Introduction 

Partial evaluation [21] is a source-to-source program transformation technique 
which specialises programs by fixing part of the input of some source program 
P and then pre-computing those parts of P that only depend on the known part 
of the input. The so-obtained transformed programs are less general than the 
original but can be much more efficient. The part of the input that is fixed is 
referred to as the static input, while the remainder of the input is called the 
dynamic input. 

Partial evaluation is especially useful when applied to interpreters. In that 
setting the static input is typically the object program being interpreted, while 
the actual call to the object program is dynamic. Partial evaluation can then pro- 
duce a more efficient, specialised version of the interpreter, which is sometimes 
akin to a compiled version of the object program [10]. 

The ultimate goal in that setting is to achieve so-called Jones optimality 
[19,21,36], i.e., fully getting rid of a layer of interpretation (called the “optimal- 
ity criterion” in [21]). More precisely, if we have a self-interpreter sint for a 
programming language L, i.e., an interpreter for L written in that same lan- 
guage L, and then specialise sint for a particular object program p we would 
like to obtain a specialised interpreter p' which is at least as efficient as p (see
Figure 1). The reason one uses a self-interpreter, rather than an interpreter in
general, is so as to be able to directly compare the running times of p and p’ 
(as they are written in the same programming language L). 

More formally, if $D$ is the input domain of $p$ and $t_p(i)$ is the running time of
the program $p$ on the input $i$, we want that $\forall d \in D : t_{p'}(d) \le t_p(d)$.




Fig. 1. Jones Optimality 



In this paper we study systematically how to specialise a wide variety of 
interpreters written in Prolog using so-called offline partial evaluation. We will 
illustrate this using the partial evaluation system logen. Starting from very 
simple interpreters we will progress towards more complicated interpreters. We 
will also show how we can actually achieve the goal of Jones optimality for a logic 
programming self-interpreter, as well as for a debugger derived from it; i.e., when 
specialising the debugger for an object program p with none of its predicates 
being spied on we will always get a specialised debugger equivalent to p. We
believe this to be the first result of its kind in a logic programming setting. In fact, 
how to effectively specialise interpreters has been a matter of ongoing research for 
many years, and has been of great interest in the logic programming community,
see e.g., [42,47,44,5,7,26,50,28] to mention just a few. However, despite these 
efforts, achieving Jones optimality in a systematic way has remained mainly 
a dream. To our knowledge, Jones optimality has been achieved only for a
simple Vanilla self-interpreter in [50], but the technique does not scale up to 
more involved interpreters. All of these works have mainly tried to tackle the 
problem using fully automatic online partial evaluation techniques, while in this 
paper we are using the offline approach. Basically, an online specialiser takes all
of its control decisions during the specialisation process itself, while an offline
specialiser is guided by a preliminary binding-time analysis, which in our case will
be (partially) done by hand. The basic reason we opt for the offline approach is
that it allows us to steer the specialisation process far better than online techniques.
This steering is of particular importance in the current setting, since all of the
previous research using automatic online techniques has shown that specialising 
interpreters (in general and especially Jones optimality) is hard to achieve. 

The paper is structured as follows. In Section 2 we present the basics of 
offline partial evaluation and of the so-called cogen approach to specialisation 
employed by logen. The logen system itself is introduced in Section 2.3. In 
Section 3 we focus on offline techniques in logic programming as employed by 
LOGEN. We then show how a simple, non-recursive interpreter can be specialised 
in Section 4 before moving to a self-interpreter in Section 5, for which we achieve 
Jones-optimality. In Section 6 this self-interpreter is extended into a debugger, 
for which Jones-optimality is also achieved. Section 7 then presents more sophis- 
ticated features of logen, required to tackle interpreters for other programming 
paradigms. Their use is illustrated in Section 8. Finally, we conclude in Section 9. 



2 Offline Partial Evaluation and the Cogen Approach 

2.1 Offline Specialisation 

Inspired by the seminal work of Futamura [10], the functional partial evaluation 
community has put a lot of effort in developing self-applicable partial evaluators. 
The first successful self-application was reported in [22], and later refined in [23] 
(see also [21]). The main idea which made this self-application possible was to 
separate the specialisation process into two phases, as depicted in Figure 2: 

- First a binding-time analysis (BTA for short) is performed which, given 
a program and an approximation of the input available for specialisation, 
approximates all values within the program and generates annotations that 
steer (or control) the specialisation process. 

- A (simplified) specialisation phase, which is guided by the result of the BTA.

Such an approach is offline because most control decisions are taken beforehand.
The interest for self-application lies with the fact that only the second,
simplified phase has to be self-applied. We refer to [22,23,21] for further details.
In the context of logic programming languages the offline approach was used to 
achieve self-application in [39,15] and more recently in [8]. 

2.2 The Cogen Approach 

Given a self-applicable partial evaluator, one can construct a so-called compiler 
generator (a cogen for short) using Futamura’s third projection (see e.g. [21]). 
A cogen is a program that given a binding-time annotated program produces 
a specialiser for that program. If the annotated program is an interpreter, this 
specialiser can be viewed as a compiler, hence the name “compiler generator.” 
Obtaining an efficient cogen by self-application is a quite difficult task. This 
has led several researchers to pursue the so-called cogen approach to program 
specialisation [17,18,4,1,14,48]. The idea behind this approach is to write the 
cogen directly by hand, rather than trying to obtain it by self-application. This
turns out to be less difficult than one could imagine. Also, from a user's point of
view, it is not important how a cogen was generated; what is important is that a
cogen exists and that it is efficient and produces efficient, non-trivial specialised
specialisers.

Fig. 2. Offline Partial Evaluation



2.3 Overview of logen 

The application of the cogen approach in a logic programming setting has led to
the LOGEN system [24,31], which we describe in more detail in the next section. 

Figure 3 highlights the way the LOGEN system works. Typically, a user would 
proceed as follows: 

- First the source program is annotated using the BTA, which produces an
  annotated source program. This annotated source program can be further
  edited.¹ This also allows an expert to inspect and manually refine the
  annotations to get better specialisation.

- Second, LOGEN is run on the annotated source program and produces a
  specialiser for the source program, called a generating extension.

- This generating extension can now be used to specialise the source program
  for some static input. Note that the same generating extension can be run
  many times for different static inputs (i.e., there is no need to re-run LOGEN
  on the annotated source program unless the annotated source program itself
  changes).

- When the remainder of the input is known, the specialised program can now
  be run and will produce the same output as the original source program. Note
  again that the same specialised program can be run for different dynamic
  inputs; one only has to re-generate the specialised program if the static input
  changes (or the original program itself changes).

¹ We have developed a special LOGEN Emacs mode as well as a Tcl/Tk editor for
this task. The figure does not show that LOGEN now also contains a term expansion
package (for SICStus and Ciao Prolog) that strips the annotations when loading
the annotated source program, allowing the annotated source program to be run
directly.

Fig. 3. Illustrating the logen system and the cogen approach
3 Offline Partial Deduction of Logic Programs

We now describe the process of offline partial evaluation of logic programs and 
give a better understanding of how logen specialises its source programs. 

Throughout this paper, we suppose familiarity with basic notions in logic 
programming. We follow the notational conventions of [34]. In particular, in 
programs, we denote variables by strings starting with an upper-case symbol, 
while the notations for constants, functions and predicates begin with a lower- 
case character. 

3.1 Partial Deduction 

The term “partial deduction” has been introduced in [25] to replace the term 
partial evaluation in the context of pure logic programs (no side effects, no cuts).
Though in some parts of the paper we briefly touch upon the consequences of 
impure language constructs, we adhere to this terminology because the word 
“deduction” places emphasis on the purely logical nature of most of the source 
programs. Before presenting partial deduction, we first present some aspects of 
the logic programming execution model. 

Formally, executing a logic program P for an atom A consists of building
a so-called SLD-tree for P ∪ {← A} and then extracting the computed answer
substitutions from every non-failing branch of that tree. Take for example the
well-known append program:

append([],L,L).
append([H|X],Y,[H|Z]) :- append(X,Y,Z).

For example, the SLD-tree for append([a,b],[c],R) is presented on the left
in Figure 4. The underlined atoms are called selected atoms. Here there is only
one branch, and its computed answer is R = [a,b,c].



[Figure 4, left tree: the complete SLD-tree for append([a,b],[c],R), whose single
branch is append([a,b],[c],R) → append([b],[c],R2) → append([],[c],R3) → □.]
Partial deduction builds upon this approach with two major differences:

— At some step in building the SLD-tree, it is possible not to select an atom, 
hence leaving a leaf with a non-empty goal. The motivation is that lack of 
the full input may cause the SLD-tree to have extra branches, in particular 
infinite ones. For example, in Figure 4 the rightmost tree is an incomplete 
SLD-tree for append(X,[c],R), whose full SLD-tree would be infinite. The
partial evaluator should not only avoid constructing infinite branches, but 
also other branches causing inefficiencies in the specialised program. Building 
such a tree is called unfolding. An unfolding rule tells us which atom to select 
at which point. Incomplete branches do not produce computed answers, they 
produce conditional answers which can be expressed as program clauses by 
taking the resultants of the branches as defined below. 

— Because of the atoms left in the leaves (in the bodies of the resultants), we 
may have to build a series of SLD-trees to ensure that every such atom is 
covered by some root of some tree. The fact that every leaf is an instance 
of a root is called closedness (sometimes also coveredness). In the example 
of Figure 4 the leaf atom append(X2,[c],R2) is already an instance of its
root atom, hence closedness is already ensured and there is no need to build 
more trees. 
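
The difference between a complete and an incomplete SLD-tree can be observed
directly in a Prolog system. The sketch below assumes SWI-Prolog (between/3 and
length/2 as built-ins). The program is named app/3 rather than append/3 only to
avoid a clash with the library predicate, and first_answers/1 is a hypothetical
helper introduced for illustration.

```prolog
% The append program from the text, renamed to app/3.
app([], L, L).
app([H|X], Y, [H|Z]) :- app(X, Y, Z).

% Complete SLD-tree: the query below has one successful branch.
% ?- app([a,b], [c], R).

% app(X, [c], R) with X unknown has infinitely many answers, so this
% (hypothetical) helper only collects those for lists X of length 0, 1 and 2.
first_answers(Rs) :-
    findall(R,
            ( between(0, 2, N), length(X, N), app(X, [c], R) ),
            Rs).
```

Calling app(X,[c],R) on its own would keep producing longer and longer answers
on backtracking, which is why a specialiser has to stop unfolding at some point
and leave the recursive call in a leaf.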

Definition 1. Let P be a program, G = ← Q a goal, D a finite SLD-derivation
of P ∪ {G} ending in ← B, and θ the composition of the mgus in the derivation
steps. Then the formula Qθ ← B is called the resultant of D.

E.g., the resultants of the derivations in the right tree of Figure 4 are:

append([],[c],[c]).
append([H|X2],[c],[H|R2]) :- append(X2,[c],R2).

Partial deduction starts from an initial set of atoms A provided by the user 
that is chosen in such a way that all runtime queries of interest are closed, i.e., 
are an instance of some atom in A. As we have seen, constructing a specialised 
program requires us to construct an SLD-tree for each atom in A. Moreover, one 
can easily imagine that ensuring closedness may require revision of the set A. 
Hence, when controlling partial deduction, it is natural to separate the control 
into two components (as already pointed out in [11,38]): 

- The local control controls the construction of the finite SLD-tree for each
  atom in A and thus determines what the residual clauses for the atoms in A
  are.

- The global control controls the content of A: it decides which atoms are
  ultimately partially deduced (taking care that A remains closed for the initial
  atoms provided by the user).

More details on exactly how to control partial deduction in general can be 
found, e.g., in [29]. In offline partial deduction the local control is hardwired, 
in the form of annotations added to the source program (either by the BTA, 
the user, or both). The global control is also partially hardwired, by specifying
which arguments to which predicate are dynamic and which ones are static. 
3.2 An Offline Partial Deduction Algorithm

As already outlined earlier, an offline specialiser works on an annotated version 
of the source program. In our approach, we use two kinds of annotations: 

- Filter declarations, which declare which arguments to which predicates are 
static and which ones dynamic. This influences the global control only. 

- Clause annotations, which indicate for every call in the body how that call 
should be treated during unfolding. This thus influences the local control 
only. For now, we assume that a call is either annotated by memo — indi- 
cating that it should not be unfolded - or by unfold — indicating that it 
should be unfolded. We introduce more annotations later on. 

There is of course an interplay between these two kinds of annotations, and 
we return to this below. 

First, let us consider as example an annotated version of the append program 
from above in which the filter declarations annotate the second argument as 
static while the others are dynamic and the clause annotations annotate the 
recursive call as memo to prevent its unfolding. Given such annotations and 
a specialisation query append(X,[c],Z), offline partial deduction would unfold
exactly as depicted in the right tree of Figure 4 and produce the resultants above. 

The following is a general algorithm for offline partial deduction given filter 
declarations and clause annotations. 

Algorithm 3.1 (offline partial deduction)

Input: A program P and an atom A
M = {A}
repeat
    select an unmarked atom A in M and mark it
    unfold A using the clause annotations in the annotated source program
    if a selected atom S is annotated as memo then
        generalise S into S' by replacing all arguments declared as dynamic
        by the filter declarations with a fresh variable
        if no variant of S' is in M then add it to M end
    end
    pretty print the specialised clauses of A
until all atoms in M are marked

In practice, renaming transformations [12] are also involved: Every atom in 
M is assigned a new predicate name, whose arity is the number of arguments 
declared as dynamic (static arguments do not need to be passed around; they 
have already been built into the specialised code). For example, the resultants 
of the derivations in the right tree of Figure 4 would get transformed into the 
following, where the second static argument has been removed: 

append__0([],[c]).
append__0([H|X2],[H|R2]) :- append__0(X2,R2).
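
That the renamed residual program computes the same answers as the original one
for the query it was specialised for can be checked with a small sketch. The
original program is named app/3 here only to avoid a clash with a library
append/3, and agree/1 is a hypothetical test predicate, not part of the
specialiser.

```prolog
% Original program (renamed to app/3).
app([], L, L).
app([H|X], Y, [H|Z]) :- app(X, Y, Z).

% Residual program for app(X,[c],R): the static second argument is gone.
append__0([], [c]).
append__0([H|X2], [H|R2]) :- append__0(X2, R2).

% agree(+X): both programs compute the same answer for a given list X.
agree(X) :-
    app(X, [c], R1),
    append__0(X, R2),
    R1 == R2.
```

For instance, agree([a,b]) succeeds because both calls bind their result to
[a,b,c]; the residual version simply no longer passes [c] around.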
To give a more precise picture, we present a Prolog version of the above
algorithm. The code is runnable (using an implementation of gensym, see [45],
to generate new predicate names). We assume that the filter declarations and
clause annotations of the source program are represented by the definition of
a filter/2 and a rule/2 predicate respectively. We discuss a more user-friendly
representation of these annotations in logen later in the chapter.

An atom A is specialised by calling memo(A,Res) in the code below. The 
memo/2 and memo_table/2 predicates return in their second argument the call 
to the new specialised predicate where the static arguments are removed and 
the dynamic ones generalised. This generalisation and filtering is performed 
by the generalise_and_filter/3 predicate that returns in its second argument
the generalised original call (to be unfolded) with fresh variables and
in its third argument the corresponding call to the specialised predicate. It 
uses the annotations as defined by the filter/2 predicate to perform its task. 
The call memo_table(X,ResX) within the definition of memo/2 simply binds 
ResX to the residual version of the call X. Note the difference between ResX, 
GenX and FX. Consider for example the filter declaration for app given below 
with X = app(S,[],S) as call. The generalised call to be unfolded, GenX, becomes
app(Y,[],Z); FX, the head of the specialised version, becomes for example
app_0(Y,Z), in which case the original call is to be replaced by ResX =
app_0(S,S).

The predicate unfold/2 computes the bodies of the specialised predicates. 
A call annotated as memo is replaced by a call to the specialised version. It 
is created, if it does not exist, by the call to memo/2. A call annotated as unfold
is further unfolded. To be able to deal with built-ins, we also add two
more annotations: a call annotated as call is completely evaluated; finally, a
call annotated as rescall is added to the residual code without modification (for
built-ins that cannot be evaluated). These two annotations can also be useful for
user-predicates (a user predicate marked as call is completely unfolded without 
further examination of the annotations, while the rescall annotation can be use- 
ful for predicates defined elsewhere or whose code is not annotated). All clauses 
defining the new predicate are collected using findall/3 and pretty printed. 

:- dynamic memo_table/2.

/* memo(X,ResX): specialise X if needed; ResX is the residual call for X */
memo(X,ResX) :-
    (memo_table(X,ResX)
     -> true /* nothing to be done: already specialised */
     ;  (generalise_and_filter(X,GenX,FX),
         assert(memo_table(GenX,FX)),
         findall((FX:-B), unfold(GenX,B), XClauses),
         pretty_print_clauses(XClauses), nl,
         memo_table(X,ResX)) ).

/* unfold(X,Code): Code is a residual body for one annotated clause of X */
unfold(X,Code) :- rule(X,B), body(B,Code).

body((A,B),(CA,CB)) :- body(A,CA), body(B,CB).
body(memo(X),ResX) :- memo(X,ResX).
body(unfold(X),ResCode) :- unfold(X,ResCode).
body(call(C),true) :- call(C).
body(rescall(C),C).

generalise_and_filter(Call,GCall,FCall) :- filter(Call,ArgTypes),
    Call =.. [P|Args],
    gen_filter(ArgTypes,Args,GenArgs,FiltArgs),
    GCall =.. [P|GenArgs],
    gensym(P,NewP), FCall =.. [NewP|FiltArgs].

/* static arguments are kept in the generalised call and filtered out;
   dynamic arguments are replaced by fresh variables and passed on */
gen_filter([],[],[],[]).
gen_filter([static|AT],[Arg|ArgT],[Arg|GT],FT) :-
    gen_filter(AT,ArgT,GT,FT).
gen_filter([dynamic|AT],[_|ArgT],[GenArg|GT],[GenArg|FT]) :-
    gen_filter(AT,ArgT,GT,FT).

Let us now examine the behaviour of this specialiser for our earlier append example. 
First, we have to produce an annotated version of the append program: 

/* the annotated source program: */

/* filter indicates how to generalise and filter */
filter(app(_,_,_),[dynamic,static,dynamic]).

/* rule annotates the clauses and indicates how to unfold */
rule(app([],L,L),call(true)).
rule(app([H|X],Y,[H|Z]),memo(app(X,Y,Z))).

Calling the specialiser with memo(app(X,[c],Y),Res) produces the following
specialised program as output:

app__1([],[c]) :- true.
app__1([_12855|_12856],[_12855|_12854]) :- app__1(_12856,_12854).

The full treatment in logen is a lot more complicated as logen supports 
a more user friendly syntax as well as various features to be introduced in the 
next sections. 
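
The specialiser of Section 3.2 relies on two helper predicates that are not
listed: gensym/2 and pretty_print_clauses/1. A minimal sketch of both follows,
assuming SWI-Prolog (atomic_list_concat/2 and portray_clause/1). The naming
scheme Base__N merely mimics the output shown above and the counter predicate
gensym_counter/1 is a hypothetical name; neither is taken from [45] or from
logen itself. Systems that already provide a suitable gensym/2 may not need the
first definition.

```prolog
:- dynamic gensym_counter/1.
gensym_counter(0).

% gensym(+Base, -Name): Name is Base, "__" and a fresh number (assumed scheme).
gensym(Base, Name) :-
    retract(gensym_counter(N0)),
    N is N0 + 1,
    assertz(gensym_counter(N)),
    atomic_list_concat([Base, '__', N], Name).

% pretty_print_clauses(+Clauses): print each clause in readable syntax.
pretty_print_clauses([]).
pretty_print_clauses([C|Cs]) :-
    portray_clause(C),
    pretty_print_clauses(Cs).
```

With these two definitions loaded next to the specialiser and the annotated app
program, the query memo(app(X,[c],Y),Res) should print the residual clauses and
bind Res to the corresponding residual call.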



3.3 Local and Global Termination 

Without proper annotations of the source program, the above offline specialiser 
may fail to terminate. There are essentially two reasons for nontermination. 

- Local nontermination: The unfolding predicate unfold/2 may fail to terminate
  or provide infinitely many answers.

- Global nontermination: Even if all calls to unfold/2 terminate, we may
  still run into problems because the partial evaluator may try to build
  infinitely many specialised versions of some predicate for infinitely many
  different static values.²

To overcome the first problem, we may have to annotate certain calls as
memo rather than unfold. In the worst case, every call is annotated as memo,
which always ensures local termination (but means that little or no specialisation
is performed).

To overcome global termination problems, we have to adjust the filter
declarations and declare more arguments as dynamic rather than static.

² One often tries to ensure that a static argument is of so-called bounded static
variation [21], so that global termination is guaranteed.

Another possible problem appears when built-ins lack enough input to behave 
as they do at run-time (either by triggering an error or by giving a different 
result). When this happens, we have to mark the offending call as rescall rather 
than call. 
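
As a small illustration of this last point, the following sketch uses standard
Prolog error handling to show what happens when an arithmetic built-in is
evaluated without enough input. The predicate try_is/2 and the outcome terms
value/1 and needs_rescall are hypothetical names chosen for this example; they
are not part of logen.

```prolog
% try_is(?N, -Outcome): attempt to evaluate N1 is N-1 as a specialiser would
% for a call annotated as call. If N is unbound, is/2 raises an
% instantiation error, signalling that the call should be a rescall instead.
try_is(N, Outcome) :-
    catch(( N1 is N - 1, Outcome = value(N1) ),
          error(instantiation_error, _),
          Outcome = needs_rescall).
```

With a known first argument, try_is(3,O) gives O = value(2); with an unbound
one, try_is(_,O) gives O = needs_rescall, which corresponds to the situation
where the annotation has to be changed from call to rescall.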

4 Propositional Logic Interpreter 

We first introduce a simple propositional logic interpreter to demonstrate the 
basic annotations. The interpreter will accept and, or, not, implies and propositional
variables. The int(Prog,Env,Result) predicate takes two input arguments,
the propositional formula and the environment containing a truth function for 
the propositional variables and produces the result. The environment is a list of 
truth values; var(i) indexes the element in the environment. 

not(true,false).
not(false,true).

and(true,true,true).
and(false,_,false).
and(true,false,false).

or(true,_,true).
or(false,true,true).
or(false,false,false).

int(true,_,true).
int(false,_,false).
int(implies(X,Y),Env,Z) :- int(or(not(X),Y),Env,Z).
int(and(X,Y),Env,Z) :- int(X,Env,R1), int(Y,Env,R2), and(R1,R2,Z).
int(or(X,Y),Env,Z) :- int(X,Env,R1), int(Y,Env,R2), or(R1,R2,Z).
int(not(X),Env,Z) :- int(X,Env,R1), not(R1,Z).
int(var(X),Env,Z) :- lookup(X,Env,Z).

lookup(0,[X|_],X).
lookup(N,[X|T],Y) :- N>0, N1 is N-1, lookup(N1,T,Y).

As was indicated in Figure 3, the source program that serves as input for
LOGEN needs annotations. The filter declaration declares how the arguments
of the residual predicates have to be treated. The annotation static announces
that the value of the argument will be known at specialisation time; the annotation
dynamic that the value of the argument will not necessarily be known at
specialisation time. Top level predicates that one intends to specialise must be
declared in this way, as well as any subsidiary predicate which cannot be fully
unfolded.

The syntax for logen's filter declarations is more user-friendly than that
used in the previous section. For example, for the propositional interpreter we
could declare:

:- filter int(static, dynamic, dynamic).
:- filter lookup(dynamic, dynamic, dynamic).

In other words, we assume that the propositional formula (the first argument 
of int/3) is known at specialisation time (static) while the environment will 
only be known at runtime (dynamic). 

Next we must annotate the clauses in the original program to control the 
specialisation. This has to be done either manually by the user (possibly with 
the help of some annotation aware editor) or by an automatic binding-time 
analysis. The following constructs can be used to annotate the calls in the clause 
bodies of the program: 

- unfold for reducible predicates; they will be unravelled during specialisation,

- memo for non-reducible predicates; they will be added to the memoisation
  table and replaced with a generalised residual predicate,

- call for built-ins or user defined predicates that should be fully evaluated
  without further intervention of the specialiser,

- rescall for calls to be kept as such in the specialised code. In contrast to
  the memo annotation, no specialised predicate definition is produced for
  the call. This annotation is especially useful for built-ins, but can also be
  useful for user predicates (e.g., because the code is not available at
  specialisation time). The example below will highlight the difference with the
  memo annotation.

As the propositional formula is known at specialisation time (static) all calls 
to int/3 can be unfolded. As concerns the variable lookups in the environment, 
let us first be cautious and mark the call to lookup as a rescall: 

int(var(X),Env,Z) :- lookup(X,Env,Z).
                         rescall

Let us specialise the interpreter for the logical formula
((var(0) ∨ (¬var(1) ∧ var(2))) ∨ false) ∧ true. The output from specialisation is a
new version of the program representing the truth table for the formula; as the
call to lookup was marked as rescall, several instantiated occurrences appear in
each resultant.

int(and(or(or(var(0),and(not(var(1)),var(2))),false),true),Env,R)
    :- int__0(Env,R).
int__0(A,true) :-
    lookup(0,A,true),lookup(1,A,true),lookup(2,A,C).
int__0(A,false) :-
    lookup(0,A,false),lookup(1,A,true),lookup(2,A,C).
int__0(A,true) :-
    lookup(0,A,true),lookup(1,A,false),lookup(2,A,true).
int__0(A,true) :-
    lookup(0,A,false),lookup(1,A,false),lookup(2,A,true).
int__0(A,true) :-
    lookup(0,A,true),lookup(1,A,false),lookup(2,A,false).




352 



Michael Leuschel et al. 



int__0(A,false) :-
    lookup(0,A,false),lookup(1,A,false),lookup(2,A,false).

Observe that no specialised predicate has been produced for lookup/3, as 
we have used the rescall annotation. If we mark the call in int/3 to lookup/3 
as memo rather than rescall and within the clauses of lookup/3 we mark the 
built-ins as rescall and the recursive call as memo, we obtain a specialised 
program containing lookup_l/3, a specialised version of lookup/3; however, 
the specialised version is but a renaming of the original as all its arguments 
were declared as dynamic:

int__0(A,true) :-
    lookup__1(0,A,true),lookup__1(1,A,true),lookup__1(2,A,B).

lookup__1(0,[B|C],B).
lookup__1(B,[C|D],E) :- B > 0, F is (B - 1), lookup__1(F,D,E).

One may notice that in all calls to lookup/3 the first argument is actually 
static. One may thus think of changing the filter declaration for lookup/3 into: 

:- filter lookup (static, dynamic, dynamic). 

Unfortunately, if we now run logen we get a specialisation time error. Indeed,
in the recursive call lookup(N1,T,Y) in the second clause of lookup/3 the
variable N1 will be unbound at specialisation time, and hence LOGEN will complain.
The problem is that we have not evaluated the call N1 is N-1 which binds
N1. Indeed, what we need to do is to annotate the clause as follows:

lookup(N,[X|T],Y) :- N>0, N1 is N-1, lookup(N1,T,Y).
                     call call       memo

There is actually no need to memo the calls to lookup: given that we know 
the first argument we can annotate all calls to lookup/3 as unfold and logen 
will produce the following program: 

int__0([true,true,B|C],true).
int__0([false,true,B|C],false).
int__0([true,false,true|B],true).
int__0([false,false,true|B],true).
int__0([true,false,false|B],true).
int__0([false,false,false|B],false).
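
The partial list patterns in these clause heads come from unfolding lookup/3
with a known index and an unknown environment. A minimal sketch, using only the
lookup/3 definition from the interpreter (with the unused head variable written
as an anonymous variable):

```prolog
lookup(0, [X|_], X).
lookup(N, [_|T], Y) :- N > 0, N1 is N - 1, lookup(N1, T, Y).

% With a static index and an unbound environment, e.g.
% ?- lookup(2, Env, V).
% the call terminates and instantiates only as much of Env as the index needs.
```

The query in the comment binds Env to a list whose third element is V and whose
tail is left open, which is the shape that appears as [true,false,true|B] and
similar patterns above.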

It is actually possible to obtain an even better specialisation than this, by 
providing more information about the structure of the environment. For that we 
need more sophisticated filter annotations, which we introduce later in Section 7. 
As a teaser, after declaring 

filter int (static, list (dynamic) , dynamic), 
one can specialise the interpreter for the call: 

int (and(or (or (var (0) , and (not (var (1) ) , var (2) ) ) .false) .true) , [A,B,C] ,D) 
obtaining the following more efficient specialised program: 
int__0(true,true,B,true).
int__0(false,true,B,false).
int__0(true,false,true,true).
int__0(false,false,true,true).
int__0(true,false,false,true).
int__0(false,false,false,false).

Indeed, the environment list has vanished and no longer needs to be manipulated.



5 Specialising the Vanilla Self-interpreter 

5.1 Background 

A classical benchmark for partial deduction has been the so-called vanilla meta- 
interpreter (see, e.g., [16,3]). This interpreter is a self-interpreter because it can 
handle the language in which it is written. The following is the vanilla meta- 
interpreter, along with an encoding of the double-append object program: 

solve(empty).
solve(and(A,B)) :- solve(A), solve(B).
solve(X) :- clause(X,Y), solve(Y).

clause(dapp(X,Y,Z,R),and(app(Y,Z,YZ),app(X,YZ,R))).
clause(app([],L,L),empty).
clause(app([H|X],Y,[H|Z]),app(X,Y,Z)).

The clause/2 facts describe the object program to be interpreted, while
solve/1 is the meta-interpreter executing the object program. In practice, solve
will often be instrumented so as to provide extra functionality for, e.g., debug- 
ging, analysis (e.g., using abstract unifications instead of concrete unification) 
or transformation. We will actually do so later in this section. However, even 
without these extensions the vanilla interpreter provides enough challenges for 
partial deduction. Indeed, we would like to specialise the interpreter so as to 
obtain a residual program at least as efficient as the object program being inter- 
preted. For example, one would like to specialise our vanilla interpreter for the 
query solve(dapp(X,Y,Z,R)) and obtain a specialised interpreter which is at
least as efficient as: 

dapp(X,Y,Z,R) :- app(Y,Z,YZ), app(X,YZ,R).
app([],L,L).
app([H|X],Y,[H|Z]) :- app(X,Y,Z).

As we have seen in the introduction (cf. Figure 1), achieving such a feat for 
every object program and query is called “Jones-optimality” [19,36]. 
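
As a minimal runnable sketch of the comparison above, the following listing
places the vanilla interpreter next to the object program it encodes. The
name obj_clause/2 is our own substitution for clause/2, an assumption made
only because several Prolog systems protect the built-in clause/2; the rest
follows the listings above.

```prolog
% Object program encoded as facts. obj_clause/2 is a hypothetical name
% standing in for clause/2, which is a protected built-in in many systems.
obj_clause(dapp(X,Y,Z,R), and(app(Y,Z,YZ), app(X,YZ,R))).
obj_clause(app([],L,L), empty).
obj_clause(app([H|X],Y,[H|Z]), app(X,Y,Z)).

% Vanilla meta-interpreter: one layer of interpretation over obj_clause/2.
solve(empty).
solve(and(A,B)) :- solve(A), solve(B).
solve(X) :- obj_clause(X,Y), solve(Y).

% The object program written directly, i.e., the efficiency target that a
% Jones-optimal specialisation of solve/1 should reach.
dapp(X,Y,Z,R) :- app(Y,Z,YZ), app(X,YZ,R).
app([],L,L).
app([H|X],Y,[H|Z]) :- app(X,Y,Z).
```

The queries solve(dapp([a],[b],[c],R)) and dapp([a],[b],[c],R) both yield
R = [a,b,c]; the first pays for the interpretation layer at runtime, which
is exactly the overhead that specialisation is meant to remove.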

Online partial evaluators such as ECCE [32] or MIXTUS [43] come close to
achieving Jones-optimality for many object programs. However, they will not
do so for all object programs and we refer the reader to [37] (discussing the
parsing problem) and the more recent [50] and [28] for more details. [50] presents
a particular specialisation technique that can achieve Jones-optimality for the 
vanilla interpreter, but the technique is very specific to that interpreter and, as 
far as we understand, does not scale to extensions of it. 

In the rest of this section we show how logen can achieve Jones-optimality 
for the vanilla interpreter, and we show how we can then handle extensions of 
the basic interpreter. 



5.2 The Nonvar Binding Time Annotation 

First, we have to present a new feature of logen which is useful when spe- 
cialising interpreters. In addition to marking arguments to predicates as static 
or dynamic, logen also supports the annotation nonvar. This means that the 
argument is not necessarily ground but has at least a top-level function symbol 
at specialisation time. When generalising the call, logen keeps the top-level 
function symbol while replacing all its sub-arguments by fresh variables. Finally, 
these subarguments become arguments in the specialised version constructed by 
logen. 

A small example will help to illustrate this annotation: 



:- filter p(nonvar).
p(f(X,X)) :- p(g(a)).
p(g(X)) :- p(h(X)).
p(h(a)).
p(h(X)) :- p(f(X,X)).

Marking every call as memo (hence no unfolding), we obtain the following 
specialised program for the call p(f(Z,Z)). The first comment line indicates the 
renamings that LOGEN has performed. 



%%% p(f(A,B)) :- p__0(A,B). p(g(A)) :- p__1(A). p(h(A)) :- p__2(A).
p__0(A,A) :- p__1(a).
p__1(A) :- p__2(A).
p__2(a).
p__2(A) :- p__0(A,A).



If we mark the last call as memo and all others as unfold, we obtain: 

%%% p(f(A,B)) :- p__0(A,B).
p__0(A,A).
p__0(A,A) :- p__0(a,a).
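
The generalisation step performed for a nonvar argument can be sketched in
a few lines of Prolog. This is only an illustration of the behaviour
described above, not LOGEN's actual code; the predicate name
generalise_nonvar/3 is our own.

```prolog
% generalise_nonvar(+Arg, -Gen, -SubArgs)  (hypothetical helper)
% Keep the top-level function symbol of Arg, replace every sub-argument by
% a fresh variable (Gen), and return the original sub-arguments (SubArgs),
% which become the arguments of the residual predicate.
generalise_nonvar(Arg, Gen, SubArgs) :-
    nonvar(Arg),                 % the annotation promises a function symbol
    functor(Arg, Name, Arity),
    functor(Gen, Name, Arity),   % same symbol, fresh variables
    Arg =.. [_|SubArgs].
```

For the call p(f(Z,Z)) the argument f(Z,Z) is generalised to f(_,_), so the
sharing between the two sub-arguments is lost in the memoised pattern. This
is why the renaming comment above reads p(f(A,B)) :- p__0(A,B) with two
distinct arguments, while the sharing reappears only in the clause heads
p__0(A,A).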



5.3 Jones-Optimality for Vanilla 

The vanilla interpreter as shown above is actually a badly written program, as
it mixes the control structures and and empty with the actual calls to predicates
of the object program. This means that the vanilla interpreter will not behave
correctly if the object program contains predicates and/2 or empty/0. This fact
also poses problems when typing the program. Even more importantly for us, it
also prevents one from annotating the program effectively for logen. Indeed,
statically there is no way to know whether any of the three recursive calls to
solve/1 has a control structure or a user call as its argument. For logen this
means that we can only mark the call clause(X,Y) as unfold. Indeed, if we mark
any of the solve/1 calls as unfold we may get into trouble, i.e., non-termination
of the specialisation process. This also means that we cannot even mark the
argument to solve/1 as nonvar, as it may actually become a variable. Indeed, take
the call solve(and(p,q)): it will be generalised into solve(and(X,Y)) and after
unfolding with the second clause we get the calls solve(X) and solve(Y). Hence
we obtain very little specialisation and we will not achieve Jones-optimality.
Two ways to solve this problem are as follows: 

- Assume that the control structures are used in a principled, predictable way 
that will allow us to produce a better annotation. 

- Rewrite the interpreter so that it is clearly typed, allowing us to produce 
an effective annotation as well as solving the problem with the name clashes 
between object program and control structures. 

We will pursue these solutions in the remainder of this section. A third pos- 
sible solution is to use more precise annotations which we introduce later in 
Section 7. This will give some improvements, but not full Jones optimality, due 
to the bad way in which solve is written. 

Structuring Conjunctions. The first solution is to enforce a standard way
of writing down conjunctions within clause/2 facts by requesting that every
conjunction is either empty or is an and whose left part is an atom and whose
right part is a conjunction. For the example above, this means that we have to
rewrite the clause/2 facts as follows:

clause(dapp(X,Y,Z,R),and(app(Y,Z,YZ),and(app(X,YZ,R),empty))).
clause(app([],L,L),empty).
clause(app([H|X],Y,[H|Z]),and(app(X,Y,Z),empty)).

This allows us to predict what to find within the arguments of a conjunction 
and thus we can now annotate the interpreter more effectively, without risking 
non-termination: 

:- filter solve(nonvar).
solve(empty).
solve(and(A,B)) :- solve(A), solve(B).
                   memo      unfold
solve(X) :- clause(X,Y), solve(Y).
            unfold       unfold

Given our assumption about the structure of conjunctions, the above anno- 
tation will still ensure termination of the generating extension: 

- Local termination: The call to clause(X,Y) can be unfolded as before,
as clause/2 is defined by facts. The calls solve(B) and solve(Y) can be
unfolded as we know that B and Y are conjunctions: logen will deconstruct
the and/2 and empty/0 function symbols. However, as solve(A) is marked
memo, the possibly recursive predicates of the object program are not
unfolded.

- Global termination: At the point when we memo solve(A) the variable
A will be bound to a predicate call. As we have marked the argument to
solve/1 as nonvar, generalisation will just keep the top-level predicate
symbol. As there are only finitely many predicate symbols, global
termination is ensured.

Specialising for solve(dapp(X,Y,Z,R)) now gives a Jones-optimal output.

%%% solve(dapp(A,B,C,D)) :- solve__0(A,B,C,D).
%%% solve(app(A,B,C)) :- solve__1(A,B,C).
solve__0(B,C,D,E) :- solve__1(C,D,F), solve__1(B,F,E).
solve__1([],B,B).
solve__1([B|C],D,[B|E]) :- solve__1(C,D,E).

LOGEN will in general produce a specialised program which is slightly better 
than the original program in the sense that it will generate code only for those 
predicates that are reachable in the predicate dependency graph from the initial 
call. E.g., for solve(app(X,Y,R)) only two clauses for app/3 will be produced,
not a clause for dapp/4. 

It is relatively easy to see that Jones optimality will be achieved for any
properly encoded object program and any call to the object program. Indeed, any
call of the form solve(p($t_1, \ldots, t_n$)) will be generalised into
solve(p(_, ..., _)), keeping information about the predicate being called;
unfolding this will only match the clauses of p as the call clause(X,Y) is
marked unfold and all of the parsing structure (and/2 and empty/0) will then be
removed by further unfolding, leaving only predicate calls to be memoised. These
are then generalised and specialised in the same manner.
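
The argument above relies on every clause body being "properly encoded". A
small checker makes that assumption explicit. This is a sketch of our own
(the name well_formed_body/1 does not come from LOGEN): it accepts exactly
those bodies that are empty or an and/2 whose left part is an object-level
atom and whose right part is again a well-formed body.

```prolog
% well_formed_body(+Body)  (hypothetical helper)
% Succeeds if Body follows the structured conjunction encoding.
well_formed_body(Body) :-
    nonvar(Body),
    wf(Body).

wf(empty).
wf(and(Atom, Rest)) :-
    callable(Atom),          % an object-level call, not a variable
    Atom \= empty,           % ... and not itself a control structure
    Atom \= and(_, _),
    well_formed_body(Rest).
```

The body and(app(Y,Z,YZ),and(app(X,YZ,R),empty)) passes the check, whereas
the original encoding and(app(Y,Z,YZ),app(X,YZ,R)) is rejected because its
right part is an atom rather than a conjunction.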



Rewriting Vanilla. The more principled solution is to rewrite the vanilla in- 
terpreter, so that the conjunction encoding and the object level atoms are clearly 
separated. The attentive reader may have noticed that above we have actually 
enforced that conjunctions are encoded as lists, with empty/0 playing the role of 
nil/0 and and/2 playing the role of ./2. The following vanilla interpreter makes 
this explicit and thus properly enforces this encoding. It is also more efficient, 
as it no longer attempts to find definitions of empty and and within the clause 
facts. 



solve([]).
solve([H|T]) :- solve_atom(H), solve(T).
solve_atom(H) :- clause(H,Bdy), solve(Bdy).

clause(dapp(X,Y,Z,R),[app(Y,Z,YZ),app(X,YZ,R)]).
clause(app([],R,R),[]).
clause(app([H|X],Y,[H|Z]),[app(X,Y,Z)]).

We can now annotate all calls to solve as unfold, knowing that this will
only deconstruct the conjunction represented as a list. However, the call to
solve_atom cannot be unfolded, as with recursive object programs we may
perform infinite unfolding. LOGEN now produces the following specialised
program for the query solve_atom(dapp(X,Y,Z,R)), having marked the argument
to solve_atom calls as nonvar.®

solve_atom__0(B,C,D,E) :-
    solve_atom__1(C,D,F), solve_atom__1(B,F,E).
solve_atom__1([],B,B).
solve_atom__1([B|C],D,[B|E]) :- solve_atom__1(C,D,E).

We have again achieved Jones-Optimality, which holds for any object pro- 
gram and any object-level query. 

An almost equivalent solution would be to improve the original vanilla inter- 
preter so that atoms are tagged by a special function symbol, e.g., as follows: 

solve(empty).
solve(and(A,B)) :- solve(A), solve(B).
solve(atom(X)) :- solve_atom(X).
solve_atom(H) :- clause(H,Bdy), solve(Bdy).

clause(dapp(X,Y,Z,R),and(atom(app(Y,Z,YZ)),atom(app(X,YZ,R)))).
clause(app([],L,L),empty).
clause(app([H|X],Y,[H|Z]),atom(app(X,Y,Z))).

We have again clearly separated the control structures from the predicate 
calls and we can basically get the same result as above (by marking all calls to 
solve as unfold and the call to solve_atom as memo). 



Reflections. So, what are the essential ingredients that allowed us to achieve 
Jones optimality where others have failed? 

- First, the offline approach allows us to precisely steer the specialisation
process in a predictable manner: we know exactly how the interpreter will be
specialised independently of the complexity of the object program. A problem
with online techniques is that they may work well for some object programs,
but then be “fooled” by other (more or less contrived) object programs; see
[50,28]. (On the other hand, online techniques are capable of removing
several layers of self-interpretation in one go. An offline approach will
typically only be able to remove one layer at a time.)

® The predicate solve does not have to be given a filter declaration as it is only 
unfolded and never residualised. 
- Second, it was also important to have sufficiently refined annotations at our
disposal. Without the nonvar annotation we would not have been able to 
specialise the original vanilla self-interpreter: we cannot mark the argument 
to solve as static and marking it as dynamic means that no specialisation 
will occur. Hence, considerable rewriting of the interpreter would have been 
required if we just had static and dynamic at our disposal.^ 

- Third, it is important that the meta-interpreter is written in such a way
that the specialiser can distinguish between conjunctions and object level 
calls and can treat them differently. 

6 Jones-Optimality for a Debugger 

Let us now try to extend the above interpreter, to do something more useful. 
The code below implements a tracing version of solve which takes two extra 
arguments: a counter for the current indentation level and a list of predicates to 
trace. 

dsolve([],_,_).
dsolve([H|T],Level,ToTrace) :-
    (debug(H,ToTrace)
      -> (indent(Level), print('Call: '), print(H), nl,
          dsolve_atom(H,s(Level),ToTrace),
          indent(Level), print('Exit: '), print(H), nl)
      ;  dsolve_atom(H,Level,ToTrace)
    ),
    dsolve(T,Level,ToTrace).

debug(Call,ToTrace) :- Call =.. [P|Args],
    length(Args,Arity), member(P/Arity,ToTrace).

:- filter indent(dynamic).
indent(0).
indent(s(X)) :- print('>'), indent(X).

:- filter dsolve_atom(nonvar, dynamic, static).
dsolve_atom(H,Level,TT) :-
    clause(H,Bdy), dsolve(Bdy,Level,TT).

Basically, the annotations of the dsolve and dsolve_atom calls are exactly as
before: calls to dsolve are marked as unfold while calls to dsolve_atom are
marked as memo. The if-then-else is marked call, i.e., it will be executed at
specialisation time. As far as the new predicates are concerned, all calls to indent
are marked memo, and all calls to print and nl are marked rescall. All other
user-defined predicates are marked as unfold and built-ins as call. Note that the

^ We leave this as an exercise for the reader. See also Section 7.1 later in the paper. 
above interpreter uses non-declarative predicates, and hence one has to be careful
about “left-propagation” of bindings [43]. In our case, one has to be careful not
to left-propagate bindings onto the first print(H) call, as this could change
the observable behaviour of the debugger. LOGEN provides special annotations
(such as hide_nf, see [31]) to prevent these problems. However, in our case we
do not need those annotations as the call dsolve_atom(H,s(Level),ToTrace)
is marked memo and hence will not generate any bindings that could affect
print(H).

For dsolve_atom(dapp([a,a,a],[b],[c],R),0,[]) we get the following almost
optimal code:

dsolve_atom__0(B,C,D,E,F) :-
    dsolve_atom__1(C,D,G,F), dsolve_atom__1(B,G,E,F).
dsolve_atom__1([],B,B,C).
dsolve_atom__1([B|C],D,[B|E],F) :- dsolve_atom__1(C,D,E,F).

In fact, the extra last argument of both predicates can be easily removed by 
the FAR redundant argument filtering post-processing of [33] which produces a 
Jones-optimal result: 

dsolve_atom__0(A,B,C,D) :-
    dsolve_atom__1(B,C,E), dsolve_atom__1(A,E,D).
dsolve_atom__1([],A,A).
dsolve_atom__1([A|B],C,[A|D]) :- dsolve_atom__1(B,C,D).

Again, it is not too difficult to see that LOGEN together with the FAR post-
processor [33] produces a Jones-optimal result for every object program P and 
call C, provided that none of the predicates reachable from C are traced. 

For dsolve_atom(dapp([a,a,a],[b],[c],R),0,[app/3]) we get the following
very efficient tracing version of our object program, where the debugging
statements have been woven into the code. This specialised code now runs with
minimal overhead, and there is no more runtime checking whether a call should
be traced or not:

dsolve_atom__0(B,C,D,E,F) :-
    indent__1(F), print('Call: '), print(app(C,D,G)), nl,
    dsolve_atom__2(C,D,G,s(F)),
    indent__1(F), print('Exit: '), print(app(C,D,G)), nl,
    indent__1(F), print('Call: '), print(app(B,G,E)), nl,
    dsolve_atom__2(B,G,E,s(F)),
    indent__1(F), print('Exit: '), print(app(B,G,E)), nl.
indent__1(0).
indent__1(s(B)) :- print('>'), indent__1(B).
dsolve_atom__2([],B,B,C).
dsolve_atom__2([B|C],D,[B|E],F) :-
    indent__1(F), print('Call: '), print(app(C,D,E)), nl,
    dsolve_atom__2(C,D,E,s(F)),
    indent__1(F), print('Exit: '), print(app(C,D,E)), nl.
Running the specialised program for dsolve_atom__0([a,b,c],[],[d],R,0),
corresponding to the call dsolve_atom(dapp([a,b,c],[],[d],R),0,[app/3])
to the original program, prints the following trace:

| ?- dsolve_atom__0([a,b,c],[],[d],R,0).
Call: app([],[d],_837)
Exit: app([],[d],[d])
Call: app([a,b,c],[d],_525)
>Call: app([b,c],[d],_1341)
>>Call: app([c],[d],_1601)
>>>Call: app([],[d],_1891)
>>>Exit: app([],[d],[d])
>>Exit: app([c],[d],[c,d])
>Exit: app([b,c],[d],[b,c,d])
Exit: app([a,b,c],[d],[a,b,c,d])
R = [a,b,c,d] ?
yes
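
For readers who want to reproduce such a trace without running LOGEN, the
unspecialised debugger can be loaded directly. The listing below is a sketch
under a few assumptions of our own: obj_clause/2 replaces clause/2 and
traced/2 replaces debug/2 (to avoid clashes with built-in or library
predicates in some systems), write/1 replaces print/1 (print/1 quotes atoms
in some systems), and the filter declarations are omitted because they are
only meaningful to LOGEN.

```prolog
:- use_module(library(lists)).   % member/2

% Object program in the list encoding (obj_clause/2 is a hypothetical name).
obj_clause(dapp(X,Y,Z,R), [app(Y,Z,YZ), app(X,YZ,R)]).
obj_clause(app([],L,L), []).
obj_clause(app([H|X],Y,[H|Z]), [app(X,Y,Z)]).

% dsolve(+Goals, +Level, +ToTrace): solve a list of atoms, printing Call and
% Exit lines for every atom whose predicate is listed in ToTrace.
dsolve([], _, _).
dsolve([H|T], Level, ToTrace) :-
    (   traced(H, ToTrace)
    ->  indent(Level), write('Call: '), write(H), nl,
        dsolve_atom(H, s(Level), ToTrace),
        indent(Level), write('Exit: '), write(H), nl
    ;   dsolve_atom(H, Level, ToTrace)
    ),
    dsolve(T, Level, ToTrace).

% traced(+Call, +ToTrace): the predicate of Call appears as Name/Arity.
traced(Call, ToTrace) :-
    Call =.. [P|Args],
    length(Args, Arity),
    member(P/Arity, ToTrace).

% indent(+Level): Level is a Peano number; print one '>' per level.
indent(0).
indent(s(X)) :- write('>'), indent(X).

dsolve_atom(H, Level, TT) :-
    obj_clause(H, Bdy),
    dsolve(Bdy, Level, TT).
```

Calling dsolve_atom(dapp([a,b,c],[],[d],R),0,[app/3]) with this listing
performs the membership test for every call at runtime; the specialised
program shown above has that test resolved at specialisation time.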

Some Experimental Results. We now present some experimental results for
specialising the solve and dsolve interpreters. The results are summarised in
Table 1. The results were obtained on a Powerbook G4 running at 1 GHz with
1 GB RAM and using SICStus Prolog 3.10.1.

The partition4 object program calls append to partition a list into 4 iden- 
tical sublists, and has been run for a list of 1552 elements. The fibonacci 
object program computes the Fibonacci numbers in the naive way using Peano 
arithmetic. This program was benchmarked for computing the 24th Fibonacci 
number. Exact queries can be found in the DPPD library [27]. The FAR filter- 
ing [33] has not been applied to the specialised programs. The time needed to 
generate and run the generating extensions was negligible (more results, with 
full times can be found later in the paper for more involved interpreters where 
this time is more significant). 



Table 1. Specialising solve and dsolve using logen 



object program   solve     specialised   speedup   dsolve     specialised   speedup
partition4       350 ms    200 ms        1.75      1590 ms    220 ms        7.23
fibonacci        890 ms    170 ms        5.24      4670 ms    180 ms        25.94



Adding More Functionality. It should be clear how one can extend the above 
logic program interpreters. A good exercise is to add more logical connectives, 
such as disjunction and implication, to the debugging interpreter dsolve and 
then see whether one can obtain something similar to the Lloyd-Topor trans- 
formations [35] automatically by specialisation (with the added benefit that de- 
bugging can still be performed at the source level). 
We will now show how one can handle interpreters for other programming
paradigms. In such a setting variables and their values may have to be stored in 
some environment structure rather than relying on the Prolog variable model. 
This will raise a new challenge, which we tackle next. 

7 More Sophisticated Annotations 

So far we have got by with just three annotations for arguments in filter
declarations: static, dynamic, and nonvar. The latter denotes a simple kind of
so-called partially static data [21]. For more realistic programs, however, it is
often essential to be able to deal with more sophisticated partially static data.
For example, interpreters often have an environment, and at specialisation time
we may know the actual variables stored in the environment but not their values.
Take the following simple interpreter for arithmetic expressions using addition,
constants and variables whose values are stored in an environment:

int(cst(C),_E,C).
int(var(V),E,R) :- lookup(V,E,R).
int(+(A,B),E,R) :- int(A,E,Ra), int(B,E,Rb), R is Ra+Rb.
lookup(V,[(V,Val)|_T],Val).
lookup(V,[(_Var,_)|T],Res) :- lookup(V,T,Res).

A typical query to the above program would be 

| ?- int(+(var(a),var(b)),[(a,1),(b,3),(c,5)],Res).
Res = 4 ?
yes

Now, if at specialisation time we know the variables of the environment
list but not their values, this would be represented by an atom to specialise
int(+(var(a),var(b)),[(a,_),(b,_),(c,_)],R). We cannot declare the
environment as static and the best we can do, given the binding types we have
seen so far, is to declare the environment as nonvar:

:- filter int(static, nonvar, dynamic).

Unfortunately, this means that logen will replace [(a,_),(b,_),(c,_)] by
[_|_], hence leading to suboptimal specialisation. For example, we cannot
annotate lookup with unfold because the environment is an open-ended list at
specialisation time.

7.1 Binding-Time Improvements and Bifurcation 

One way to overcome such limitations is often to rewrite the program to be 
specialised into a semantically equivalent program which specialises better, i.e., 
in which more arguments can be classified as static and/or more calls can be 
unfolded. This process is called binding-time improvement, see, e.g., Chapter 12
of [21].

One simple binding-time improvement for this particular problem is to define 
an auxiliary entry point as follows: 

aux(Expr,A,B,C,Res) :- int(Expr,[(a,A),(b,B),(c,C)],Res).

Now, we can annotate the calls to int and lookup with unfold and the calls 
to is with rescall and use the following filter declaration: 

:- filter aux(static, dynamic, dynamic, dynamic, dynamic).

However, this solution only works because we can completely unfold the 
predicates int and lookup. Hence, this solution is rather ad-hoc and works only 
in special circumstances. For example, if the object language supports recursive 
procedures, this will not work. 

A more principled solution is to apply a binding-time improvement sometimes
called bifurcation [9,40]. This consists of splitting the environment into two
parts (the static and the dynamic part) and then rewriting the interpreter
accordingly. Here, a solution is to split the environment into two lists: a
static one containing the variable names and a dynamic list containing the
actual values. We would then rewrite our interpreter as follows:

:- filter int(static, static, dynamic, dynamic).
int(cst(C),_E,_E2,C).
int(var(V),E,E2,R) :- lookup(V,E,E2,R).
int(+(A,B),E,E2,R) :- int(A,E,E2,Ra), int(B,E,E2,Rb), R is Ra+Rb.

:- filter lookup(static, static, dynamic, dynamic).
lookup(V,[V|_],[Val|_],Val).
lookup(V,[_|T],[_|ValT],Res) :- lookup(V,T,ValT,Res).

One can now annotate all calls to int and lookup with unfold. It is even
possible to annotate calls to int or to lookup(V,E,E2,R) as memo without
losing much specialisation, as one part of the split environment is static and
still available when specialising lookup.
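
To check that bifurcation preserves the behaviour of the interpreter, both
versions can be loaded side by side, since int/3 and int/4 (and lookup/3
and lookup/4) differ in arity. The helper split_env/3 is our own addition
for illustration; it is not part of the interpreter or of LOGEN.

```prolog
% Original interpreter: the environment is a list of (Var,Val) pairs.
int(cst(C), _E, C).
int(var(V), E, R) :- lookup(V, E, R).
int(+(A,B), E, R) :- int(A, E, Ra), int(B, E, Rb), R is Ra+Rb.

lookup(V, [(V,Val)|_T], Val).
lookup(V, [(_Var,_)|T], Res) :- lookup(V, T, Res).

% Bifurcated interpreter: names (static part) and values (dynamic part)
% are kept in two separate lists of the same length.
int(cst(C), _E, _E2, C).
int(var(V), E, E2, R) :- lookup(V, E, E2, R).
int(+(A,B), E, E2, R) :- int(A, E, E2, Ra), int(B, E, E2, Rb), R is Ra+Rb.

lookup(V, [V|_], [Val|_], Val).
lookup(V, [_|T], [_|ValT], Res) :- lookup(V, T, ValT, Res).

% split_env(+Env, -Names, -Values)  (hypothetical helper)
% Split a list of (Var,Val) pairs into the two lists used above.
split_env([], [], []).
split_env([(N,V)|T], [N|Ns], [V|Vs]) :- split_env(T, Ns, Vs).
```

With Env = [(a,1),(b,3),(c,5)], split_env/3 gives the name list [a,b,c] and
the value list [1,3,5], and the query int(+(var(a),var(b)),[a,b,c],[1,3,5],R)
yields R = 4, as the original interpreter does for Env.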

There are however several problems with this approach: 

- It can be very cumbersome and error-prone to rewrite the program.

- For every different annotation we may have to rewrite the program in a 
different way. 

- If the dynamic and static data are not as neatly separated as above, it can 
be non-trivial to find a proper separation. 

- The final result is not always “optimal”. E.g., in the example above the 
information that the variable list and the value list must be of the same 
length is no longer explicit, resulting in a suboptimal residual program. For 
example, specialising for lookup(b,[a,b,c],[1,X,Y],Res) gives

%%% lookup(b,[a,b,c],[1,X,Y],Res) :- lookup__0([1,X,Y],Res).
%%% lookup(b,[a,b,c],A,B) :- lookup__0(A,B).
lookup__0([B,C|D],C).

This is less efficient than the result we will obtain later below, mainly because
the value list has still to be deconstructed and examined at runtime (via the
unification with [B,C|D]).

LOGEN provides a better way of solving this problem by allowing its users 
to define their own annotations using what we will call binding-types. For the 
interpreter above we would like to be able to define a custom annotation de- 
scribing a list of pairs whose first element is static and the second dynamic. In 
the rest of this section we formalise and describe how this can be achieved. 

7.2 Formal Definition of Binding-Types

In what follows, we present a polished version of the notion of a binding-type as
introduced in [31], in order to characterise partially instantiated
specialisation-time values in a more precise way. Like a traditional type in
logic programming [2], a binding-type is conceptually defined as a set of terms
closed under substitution and represented by a term constructed from type
variables and type constructors in the same way that a data term is constructed
from ordinary variables and function symbols. However, the underlying type
system is different from the one of Mercury used in [49] for developing
binding-types, where the right hand side of a rule consists of a number of
alternatives of the form $f(\tau_1, \ldots, \tau_k)$ with $f$ a function symbol
and the $\tau_i$ types. The logen user has to cope with untyped Prolog programs
and is interested not in well-typing them but in concisely expressing the
relevant binding-types. Hence LOGEN allows for union types and for function
symbols anywhere in the names of types and in the right hand side of type rules.
To distinguish between function symbols and type constructors, a wrapper type/1
is used for the latter. The wrapper is omitted for the predefined binding-types
static/0, dynamic/0, nonvar/0, and list/1. Formally, a type is inductively
defined as follows:

Definition 2. The set of types is the least set defined by the following rules: 

- A type variable is a type. 

- static, dynamic, and nonvar are types. 

- If $\tau$ is a type then list($\tau$) is a type.

- If $c/n$ is a type constructor different from static, dynamic, nonvar and
list/1 and $\tau_1, \ldots, \tau_n$ are types then
type($c(\tau_1, \ldots, \tau_n)$) is a type.

- If $f/n$ is a function symbol and $\tau_1, \ldots, \tau_n$ are types then
$f(\tau_1, \ldots, \tau_n)$ is a type.

As user programs may use the predefined binding-types as function symbols, the 
need could arise to refer to these function symbols in a binding type. Therefore, 
LOGEN also provides a wrapper term/1. For example, term(static) is the type 
denoting the singleton set with the function symbol static and not the binding- 
type static. To keep the exposition simple, we have not included the term
wrapper in the above definition of types and we will omit it entirely in what
follows.

The set of terms denoted by a type of the form $f(\tau_1, \ldots, \tau_n)$ are
all the terms of the form $f(t_1, \ldots, t_n)$ with, for all $i$,
$t_i \in \tau_i$. For types of the form type($c(\tau_1, \ldots, \tau_n)$), the
denotation has to be defined by a type rule.

Definition 3. A type rule for a type constructor $c$ of arity $n$ is of the form:

:- type $c(V_1, \ldots, V_n)$ ---> $(\tau_1 ; \ldots ; \tau_k)$.

with $k \geq 1$, $n \geq 0$ and where $V_1, \ldots, V_n$ are distinct type
variables, and $\tau_1, \ldots, \tau_k$ are distinct types. Any type variable
occurring in the right hand side must occur also in the left hand side. A set of
type rules is a type definition.

With $n = 0$, a type rule defines a monomorphic or ground type; with $n > 0$,
the type is polymorphic and the type rule defines the denotation for every type
instance of the polymorphic type. For example, the type rule corresponding to
the predefined type list(V) is:

:- type list(V) ---> [] ; [V | list(V)].

Every type type($c(\tau_1, \ldots, \tau_n)$) used in the annotations of logen’s
input must be defined, i.e., there must be a type rule with left hand side
$c(V_1, \ldots, V_n)$ and, for all types type($\tau$) occurring in the right
hand side of the type rule, the type
type($\tau\{V_1/\tau_1, \ldots, V_n/\tau_n\}$) must be defined.

Now we can formally define the denotations of types: 

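Definition 4. $[\![\tau]\!]$, the set of terms denoted by a type $\tau$, is
defined as follows:

- $[\![\mathit{dynamic}]\!] = \{t \mid t \text{ is a term}\}$.

- $[\![\mathit{static}]\!] = \{t \mid t \text{ is a ground term}\}$.

- $[\![\mathit{nonvar}]\!] = \{t \mid t \text{ is a non-variable term}\}$.

- $[\![\mathit{type}(c(\tau_1, \ldots, \tau_n))]\!] = \{t \mid t \in [\![\tau]\!]$
and there is a type rule of the form
:- type $c(V_1, \ldots, V_n)$ ---> $(\ldots ; \tau ; \ldots)$ and
$t \in [\![\tau\{V_1/\tau_1, \ldots, V_n/\tau_n\}]\!]\}$.

- $[\![f(\tau_1, \ldots, \tau_n)]\!] = \{f(t_1, \ldots, t_n) \mid t_i \in [\![\tau_i]\!] \text{ for all } i\}$.

- $[\![\mathit{list}(\tau)]\!] = \{[\,]\} \cup \{[t_1 | t_2] \mid t_1 \in [\![\tau]\!] \text{ and } t_2 \in [\![\mathit{list}(\tau)]\!]\}$.

Note that our definitions guarantee that types are downwards-closed (i.e.,
$t \in [\![\tau]\!]$ implies $t\theta \in [\![\tau]\!]$).

A few examples are as follows: $[\,] \in [\![\mathit{static}]\!]$,
$[\,] \in [\![[\,]]\!]$, $[\,] \in [\![\mathit{list}(\mathit{static})]\!]$,
$[\,] \in [\![\mathit{list}(\mathit{dynamic})]\!]$;
$s(0) \in [\![\mathit{static}]\!]$ hence
$[s(0)] \in [\![\mathit{list}(\mathit{static})]\!]$;
$X \in [\![\mathit{dynamic}]\!]$ and $Y \in [\![\mathit{dynamic}]\!]$, hence
$[X,Y] \in [\![\mathit{list}(\mathit{dynamic})]\!]$.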
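
The denotations of Definition 4 can be turned into a small membership test.
The following is a sketch of our own, not LOGEN code: the names in_type/2
and type_rule/2 are hypothetical, type rules are stored as facts whose
second argument lists the alternatives of the right hand side, and the
term/1 wrapper is ignored, as in the text.

```prolog
% type_rule(Head, Alternatives): our own encoding of a rule
% ":- type Head ---> Alt1 ; ... ; Altk".
type_rule(bind_list(X), [list((static,X))]).

% in_type(+Term, +Type): Term belongs to the denotation of Type.
in_type(Term, Type) :-
    nonvar(Type),
    in_type_(Type, Term).

in_type_(dynamic, _) :- !.
in_type_(static, T) :- !, ground(T).
in_type_(nonvar, T) :- !, nonvar(T).
in_type_(list(Ty), T) :- !, in_list(T, Ty).
in_type_(type(C), T) :- !,
    type_rule(C, Alts),          % instantiates the rule's type variables
    alt_member(Alt, Alts),
    in_type(T, Alt).
in_type_(FType, T) :-            % function symbol applied to types
    nonvar(T),
    FType =.. [F|ArgTypes],
    T =.. [F|Args],
    args_in_types(Args, ArgTypes).

% A variable tail is neither [] nor a cons cell, so open lists are rejected.
in_list(L, _) :- L == [].
in_list(L, Ty) :- nonvar(L), L = [H|T], in_type(H, Ty), in_list(T, Ty).

args_in_types([], []).
args_in_types([A|As], [Ty|Tys]) :- in_type(A, Ty), args_in_types(As, Tys).

alt_member(X, [X|_]).
alt_member(X, [_|T]) :- alt_member(X, T).
```

The examples listed above all succeed under this sketch, for instance
in_type([s(0)], list(static)) and in_type([X,Y], list(dynamic)), whereas the
open-ended list [X|T] is not accepted as a list(dynamic).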

7.3 Using Binding-Types

The three basic binding types that are now used to control generalisation and
filtering (the predicate generalise_and_filter) within the offline partial
deduction algorithm of Section 3.2 are as follows:

- An argument marked as dynamic is replaced by a fresh variable and there
will be a corresponding argument in the residual predicate.

- An argument marked as static is not generalised, and there will be no
corresponding argument in the residual predicate.

- The top-level function symbol of an argument marked as nonvar will be
kept, while all of its arguments are replaced by fresh variables. There will
be one argument in the residual predicate for each argument of the top-level
function symbol.

- An argument marked as $f(\tau_1, \ldots, \tau_n)$ is basically dealt with like
the nonvar case, except that the top-level function symbol has to be $f$ and
every sub-argument of $f$ will be recursively generalised and filtered
according to the binding-types $\tau_i$.

- For an argument marked as type($c(\tau_1, \ldots, \tau_n)$) the type rule of
$c$ will be looked at and the argument will be treated according to the body of
the rule. For disjunctions like $\tau_1 ; \tau_2$ the algorithm will first
attempt to apply $\tau_1$, and if that is not successful it will apply $\tau_2$.

For example, given the declaration “:- filter p(static,dynamic,nonvar).”
the call p(a,[b],f(c,d)) is generalised into p(a,_,f(_,_)) and the residual
version of the call is of the form p__1([b],c,d). Given the declaration
“:- filter p(static,dynamic,f(static,dynamic)).” the call is generalised into
p(a,_,f(c,_)) and the residual version is of the form p__2([b],d). Finally,
using “:- filter p(static,list(dynamic),static).” as filter declaration, the
same call is generalised into p(a,[_],f(c,d)) with the residual version being of
the form p__3(b).
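
The three examples above can be reproduced with a short sketch of
generalisation and filtering. This is our own illustration of the rules
listed in this section and not the generalise_and_filter predicate of
LOGEN; the names gen_filter/4 and type_rule/2 are hypothetical, and a filter
declaration p(T1,...,Tn) is simply treated as a function-symbol type.

```prolog
:- use_module(library(lists)).   % append/3

% type_rule(Head, Alternatives): our own encoding of a type rule.
type_rule(bind_list(X), [list((static,X))]).

% gen_filter(+Type, +Term, -Gen, -ResArgs): Gen is Term generalised
% according to Type; ResArgs are the arguments of the residual predicate.
gen_filter(Type, Term, Gen, Res) :-
    nonvar(Type),
    gf(Type, Term, Gen, Res).

gf(dynamic, Term, _Fresh, [Term]) :- !.          % fresh variable, kept as arg
gf(static, Term, Term, []) :- !, ground(Term).   % kept as is, no residual arg
gf(nonvar, Term, Gen, Args) :- !,
    nonvar(Term),
    functor(Term, F, N), functor(Gen, F, N),
    Term =.. [_|Args].
gf(list(Ty), Term, Gen, Res) :- !, gf_list(Term, Ty, Gen, Res).
gf(type(C), Term, Gen, Res) :- !,
    type_rule(C, Alts),
    first_alt(Alts, Term, Gen, Res).
gf(FType, Term, Gen, Res) :-                     % function-symbol type
    nonvar(Term),
    FType =.. [F|Types],
    Term =.. [F|Args],
    gf_args(Types, Args, GenArgs, Res),
    Gen =.. [F|GenArgs].

gf_list(L, _, [], []) :- L == [], !.
gf_list(L, Ty, [GH|GT], Res) :-
    nonvar(L), L = [H|T],
    gen_filter(Ty, H, GH, R1),
    gf_list(T, Ty, GT, R2),
    append(R1, R2, Res).

gf_args([], [], [], []).
gf_args([Ty|Tys], [A|As], [G|Gs], Res) :-
    gen_filter(Ty, A, G, R1),
    gf_args(Tys, As, Gs, R2),
    append(R1, R2, Res).

% Try the alternatives of a type rule in order; commit to the first match.
first_alt([Alt|Alts], Term, Gen, Res) :-
    (   gen_filter(Alt, Term, Gen0, Res0)
    ->  Gen = Gen0, Res = Res0
    ;   first_alt(Alts, Term, Gen, Res)
    ).
```

For instance, gen_filter(p(static,dynamic,nonvar), p(a,[b],f(c,d)), G, R)
gives G = p(a,_,f(_,_)) and R = [[b],c,d], matching the first example.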

Let us now try to tackle the original arithmetic int/3 interpreter using the
more refined binding-types. First, we define a new type, describing a list of pairs
whose first element is static and whose second element is given by a parameter
of the type constructor (so as to show how parameters can be used):

:- type bind_list(X) ---> list((static,X)).

For the interpreter we can now simply provide the following filter declara- 
tions: 

:- filter int(static, type(bind_list(dynamic)), dynamic).
:- filter lookup(static, type(bind_list(dynamic)), dynamic).

Given these filter declarations, we can now annotate the clause bodies as 
follows: 

int(cst(C),_E,C).
int(var(V),E,R) :- lookup(V,E,R).
                   unfold
int(+(A,B),E,R) :- int(A,E,Ra), int(B,E,Rb), R is Ra+Rb.
                   unfold       unfold       rescall
lookup(V,[(V,Val)|_T],Val).
lookup(V,[(_Var,_)|T],Res) :- lookup(V,T,Res).
                              unfold
While these annotations and types were derived by hand, we believe that it is
possible to derive them automatically. One approach is to adapt the polymorphic 
binding-time analysis for Mercury presented in a companion chapter [49] of this 
book. For more details see [49]. A fully automatic monomorphic binding-time 
analysis, refining earlier work in [6,31] is currently being implemented within the 
EU-funded project ASAP (see http://clip.dia.fi.upm.es/Projects/ASAP/). 

Let us now use LOGEN to specialise the original int/3 interpreter for the 
query lookup(b,[(a,1),(b,X),(c,Y)],Res). This results in the following 
specialised code: 

%%% lookup(b,[(a,A),(b,B),(c,C)],D) :- lookup__0(A,B,C,D). 

lookup__0(B,C,D,C). 

This code is much more efficient, as linear time lookup of variable bindings 
has been replaced by basically constant time lookup in the argument list. 

Let us now specialise the interpreter for a full-fledged query: 
int(+(cst(3),+(+(cst(2),cst(5)),+(var(y),+(var(x),var(y))))), 
[(a,1),(b,2),(x,3),(y,4)],X). This produces the following satisfactory 
result, where the arithmetic expression has been fully compiled into Prolog code. 

int__0(B,C,D,E,F) :- G is (2 + 5), H is (D + E), 
    I is (E + H), J is (G + I), F is (3 + J). 
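As a sanity check, one can load the unannotated interpreter next to the residual predicate and compare their answers. The sketch below assumes the interpreter clauses exactly as listed above (without annotations); the predicate agree/4 is a hypothetical helper introduced only for this comparison.

```prolog
% Original interpreter; the environment is a list of (Name,Value) pairs.
int(cst(C), _E, C).
int(var(V), E, R) :- lookup(V, E, R).
int(+(A,B), E, R) :- int(A, E, Ra), int(B, E, Rb), R is Ra + Rb.

lookup(V, [(V,Val)|_T], Val).
lookup(V, [(_Var,_)|T], Res) :- lookup(V, T, Res).

% Residual predicate from the text; the values bound to a and b are unused.
int__0(_B, _C, D, E, F) :-
    G is (2 + 5), H is (D + E),
    I is (E + H), J is (G + I), F is (3 + J).

% agree(+A,+B,+X,+Y): interpreter and residual code compute the same value
% for the environment [(a,A),(b,B),(x,X),(y,Y)].
agree(A, B, X, Y) :-
    Expr = +(cst(3), +(+(cst(2),cst(5)), +(var(y), +(var(x),var(y))))),
    once(int(Expr, [(a,A),(b,B),(x,X),(y,Y)], R1)),
    int__0(A, B, X, Y, R2),
    R1 =:= R2.

% ?- agree(1, 2, 3, 4).
% true.   (both sides evaluate to 21)
```

The once/1 wrapper is needed because lookup/3, as written, may leave choice points.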

One can see that the reduction G is (2+5) has not been performed by the 
specialiser. This shows an aspect where an online specialiser could have fared 
better, as it could have realised that, for this particular instruction, the right 
hand side of the is/2 was actually known (even though it is in general dynamic). 
Still, it is possible to instruct LOGEN to try to perform calls using the so-called 
semicall annotation [31]. Another alternative is to binding-time improve the 
program by inserting an explicit if-statement, changing the 3rd clause of the 
interpreter as follows: 

int(+(A,B),E,E2,R) :- int(A,E,E2,Ra),            /* unfold */ 
                      int(B,E,E2,Rb),            /* unfold */ 
    ( ground((Ra,Rb))                            /* call */ 
      -> R is Ra+Rb                              /* call */ 
      ;  R is Ra+Rb ).                           /* rescall */ 

where the if-statement itself is marked call and executed at specialisation 
time. The resulting specialised interpreter is then: 

int__0(B,C,D,E,F) :- G is (D + E), H is (E + G), 
    I is (7 + H), F is (3 + I). 

7.4 Revisiting Vanilla Again 

Finally, let us present a third solution for specialising the Vanilla self-interpreter 
from Section 5.3. Indeed, we can now use the following more precise binding 
types on the original interpreter, thus ensuring that relevant information will be 
kept by the generalisation: 
:- type vexp ---> (empty ; and(type(vexp),type(vexp)) 
                 ; type(predcall)). 
:- type predcall ---> (app(dynamic,dynamic,dynamic) 
                 ; dapp(dynamic,dynamic,dynamic,dynamic)). 
:- filter solve(type(vexp)). 

Given these filter declarations, we can mark the calls solve(A), solve(B) 
and clause(X,Y) as unfold, and mark the call solve(Y) as memo. This will 
not give full Jones optimality, due to the bad way in which the original solve 
is written, but it will at least give much better specialisation than was possible 
using just static, dynamic, and nonvar. 

8 Lambda Interpreter 

Based on the insights of the previous section, we now tackle a more substantial 
example. We will present an interpreter for a small functional language. The in- 
terpreter still leaves much to be desired from a functional programming language 
perspective, but the main purpose is to show how to specialise a non-trivial inter- 
preter for another programming paradigm. The interpreter will use an environ- 
ment, very much like the one in the previous section, to store values for variables 
and function arguments. The full annotated source code is available with the 
LOGEN distribution at http://www.ecs.soton.ac.uk/~mal/systems/logen.html. 

To keep things simple, we will not use a parser but simply use Prolog’s 
operator declarations to encode the functional programs. The following shows 
how to encode the Fibonacci function for our interpreter: 

:- op(150,fx,$).        /* to indicate variables */ 
:- op(150,fx,&).        /* to indicate constants */ 
:- op(150,yfx,'===').   /* to define functions */ 
:- op(150,yfx,@).       /* to do calls to defined functions */ 
:- op(250,yfx,'->').    /* for sequential composition */ 

fib === lambda(x, if($x = &0, &1, 
                  if($x = &1, &1, 
                     (fib @ ($x - &1) + fib @ ($x - &2))))). 

The source code of the interpreter is as shown below. As usual in functional 
programming, one distinguishes between constructors (encoded using constr/2) 
and functions (encoded using lambda/2). Functions can be defined statically using 
the === declarations which can then be extracted using the fun/1 expression. 
One can use @ as a shorthand to call such defined functions. One can introduce 
local variables using the let/3 expression. The predicate eval/3 computes the 
normal form of an expression. The rest of the code should be pretty much self- 
explanatory. To keep the code simpler, we have not handled renaming of the 
arguments of lambda expressions (it is not required for the examples we will 
deal with). 
eval('&'(C),_Env,constr(C,[])).   /* 0-ary constructor */ 
eval(constr(C,Args),Env,constr(C,EArgs)) :- l_eval(Args,Env,EArgs). 
eval('$'(VKey),Env,Val) :- /* variable */ lookup(VKey,Env,Val). 
eval('+'(X,Y),Env,constr(XY,[])) :- eval(X,Env,constr(VX,[])), 
    eval(Y,Env,constr(VY,[])), XY is VX+VY. 
eval('-'(X,Y),Env,constr(XY,[])) :- eval(X,Env,constr(VX,[])), 
    eval(Y,Env,constr(VY,[])), XY is VX-VY. 
eval('*'(X,Y),Env,constr(XY,[])) :- eval(X,Env,constr(VX,[])), 
    eval(Y,Env,constr(VY,[])), XY is VX*VY. 
eval(let(VKey,VExpr,InExpr),Env,Result) :- eval(VExpr,Env,VVal), 
    store(Env,VKey,VVal,InEnv), eval(InExpr,InEnv,Result). 
eval(if(Test,Then,Else),Env,Res) :- eval_if(Test,Then,Else,Env,Res). 
eval(lambda(X,Expr),_Env,lambda(X,Expr)). 
eval(apply(Arg,F),Env,Res) :- eval(F,Env,FVal), 
    eval(Arg,Env,ArgVal), eval_apply(ArgVal,FVal,Env,Res). 
eval(fun(F),_,FunDef) :- '==='(F,FunDef). 
eval('@'(F,Args),E,R) :- eval(apply(Args,fun(F)),E,R). 
eval(print(X),Env,FVal) :- eval(X,Env,FVal), print(FVal), nl. 
eval('->'(X,Y),Env,Res) :- /* seq. composition */ 
    eval(X,Env,_), eval(Y,Env,Res). 

eval_apply(ArgVal,FVal,Env,Res) :- rename(FVal,Env,lambda(X,Expr)), 
    store(Env,X,ArgVal,NewEnv), eval(Expr,NewEnv,Res). 

rename(Expr,_Env,RenExpr) :- RenExpr=Expr. /* sufficient for now */ 

l_eval([],_E,[]). 
l_eval([H|T],E,[EH|ET]) :- eval(H,E,EH), l_eval(T,E,ET). 

eval_if(Test,Then,_Else,Env,Res) :- test(Test,Env), !, eval(Then,Env,Res). 
eval_if(_Test,_Then,Else,Env,Res) :- eval(Else,Env,Res). 

test('='(X,Y),Env) :- eval(X,Env,VX), eval(Y,Env,VX). 

store([],Key,Value,[Key/Value]). 
store([Key/_Value2|T],Key,Value,[Key/Value|T]). 
store([Key2/Value2|T],Key,Value,[Key2/Value2|BT]) :- 
    Key\==Key2, store(T,Key,Value,BT). 

lookup(Key,[Key/Value|_T],Value). 
lookup(Key,[Key2/_Value2|T],Value) :- 
    Key\==Key2, lookup(Key,T,Value). 

Handling the Cut. One may notice that the above program does use a cut in 
the code for eval_if. Previous versions of LOGEN did not support the cut, but it 
turns out that specialising the cut is actually very easy to do: basically all one 
has to do is to simply mark the cuts using either the call or rescall annotations 
we have already encountered. It is up to the binding time analysis to ensure that 
this is sound, i.e., one has to ensure that: 
- If a cut is marked call, then whenever it is reached and executed at 
specialisation time the calls to the left of the cut will never fail at runtime. 

- If a cut is marked as rescall within a predicate p, then no calls to p are 
unfolded. One can relax this condition somewhat, e.g., one may be able 
to unfold such a predicate p if all computations are deterministic (like in our 
functional interpreter) but one has to be very careful when doing that. 

These conditions are sufficient to handle the cut in a sound, but still useful 
manner. Details about handling the cut in an online specialiser can be found in 
[41,43]. 



Annotations. To be able to specialise this interpreter we need the power of 
logen’s binding types. The structure of the environment is much like in the 
previous section, but here we have more information about the structure of values 
that the interpreter manipulates and stores. Basically, values are encoded using 
constr/2, whose first argument is the symbol of the constructor being encoded 
and the second argument is a list containing the encoding of the arguments. A 
lambda expression is also a valid value. 

:- type value_expression = 
      (constr(dynamic,list(type(value_expression))) ; 
       lambda(static,static)). 

:- type env = list(static / type(value_expression)). 

We can now annotate the calls of our program. Basically, all built-ins have 
to be marked rescall but all user calls can be marked as unfold except for the 
call eval_apply(ArgVal,FVal,Env,Res). We thus supply the following filter 
declaration: 

:- type result = (type(value_expression) ; dynamic). 

:- filter eval_apply(type(result),type(result),type(env),dynamic). 

Note that we use a union type for result, because often (but not always) 
we will have partial information about the result types. Union types are thus 
a way to allow logen to make some online decisions: during specialisation it 
will check whether the first and second argument of eval_apply match the 
value_expression type and it will treat the arguments as dynamic (the sec- 
ond alternative in the type result) when they do not. 



Experiments. When specialising this program for, e.g., calling the fib function 
we get something very similar to the (naive) Fibonacci program one would have 
written in Prolog in the first place: 

%%% eval_apply(constr(A,[]),lambda(x,if($x= &0,&1,if($x= &1,&1, 
%%%     fib@($x- &1)+fib@($x- &2)))),[x/constr(B,[])],C) :- 
%%%     eval_apply__2(A,B,C). 

eval_apply__2(0,B,constr(1,[])) :- !. 
eval_apply__2(1,B,constr(1,[])) :- !. 
eval_apply__2(B,C,constr(D,[])) :- 
    E is (B - 1), eval_apply__2(E,B,constr(F,[])), 
    G is (B - 2), eval_apply__2(G,B,constr(H,[])), D is (F + H). 

This specialised code runs about 14 times faster than the original, and even 
when including the specialisation time, i.e., the time to run logen and the 
generating extension, the specialised program is still 7 times faster than running 
the original program. Full details of this experiment can be found in Table 2. 

Furthermore, the experiments described below indicate that speedups get 
bigger for more complicated object programs with more functions and 
more arguments and variables. One reason is that more complicated object 
programs will have more variables, and hence looking up variable values in the 
list environment will get more and more expensive, whereas lookup in the 
specialised program will be basically a constant time operation (relevant variables 
are arguments of the specialised predicates). Indeed, specialising the interpreter 
for the following slightly bigger functional program, which has extra loop 
variables, results in bigger speedups. 

loop_fib === lambda(cur, let(cur1, $cur + &1, let(cur2, $cur1 + &1, 
    let(cur3, $cur2 + &1, if(($cur = &21), 
        (fib @ ($cur)), 
        (print(constr(fibonacci,[$cur, fib @ ($cur)])) 
            -> (loop_fib @ ($cur1)))))))). 

In the same table one can see figures for loop_fib2, loop_fib3, loop_fib4, 
loop_fib5, each with 3 more variables in the environment than its predecessor, 
but apart from that behaving identically to loop_fib. As can be seen, the 
specialised programs basically all run in the same time (60-70 ms), whereas the 
original interpreter runs considerably slower with more variables, increasing the 
speedup to 45 for loop_fib5. 

Note that LOGEN has only to be run once for the eval interpreter; the same 
generating extension can then be used for specialising the interpreter with respect 
to any functional program. Similarly, the specialised code can then be used for 
any call to the given functional program.* 

* In the speedup figures we suppose that the time needed for consulting is the same for 
the original and specialised program. In our experiments consulting the specialised 
program was actually slightly faster, but this may not always be the case. 

Table 2. Specialising eval using LOGEN 

function       eval      LOGEN   genex   specialised   speedup   speedup      speedup 
call           runtime   time    time    runtime                 (incl. gx)   (incl. gx, LOGEN) 
fib(24)        1050 ms   60 ms   15 ms   75 ms         14.0      11.7          7 
loop_fib(0)    1430 ms   60 ms   30 ms   60 ms         23.8      15.9          9.5 
loop_fib2(0)   1940 ms   60 ms   40 ms   60 ms         32.3      19.4         12.1 
loop_fib3(0)   2460 ms   60 ms   50 ms   60 ms         41.0      22.4         14.5 
loop_fib4(0)   2540 ms   60 ms   50 ms   70 ms         36.3      21.2         14.1 
loop_fib5(0)   3150 ms   60 ms   60 ms   70 ms         45.0      24.2         16.6 

9 Discussion and Conclusion 

Probably the most closely related work is [20] which treats untyped first-order 
functional languages, and gives a list of recommendations on how to write 
interpreters that specialise well. Even though [20] does of course not address the 
specific issues that arise when specialising logic programming interpreters, many 
points raised in [20] are also valid in the logic programming setting. For example, 
[20] suggests that you should “Write your interpreter compositionally” which is 
exactly what we have done for our lambda interpreter in Section 8 and which 
makes it much easier to ensure termination of the specialisation process. [20] also 
warns of “data structures that contain static data, but can grow unboundedly 
under dynamic control” (such as a stack). The environment in the lambda 
interpreter contained static data but its length was fixed and so caused no problem; 
however, if we were to add an activation stack to our interpreter in Section 8 we 
would have to resort to the recipes suggested in [20]. 

We have already discussed related work in the logic programming commu- 
nity [42,47,44,5,7,26,50,28]. In the functional community there has been a lot 
of recent interest in Jones optimality; see [19,36,46,13]. For example, [13] shows 
theoretically the interest of having a Jones-optimal specialiser and the results 
should also be relevant for logic programming. 

As far as future work is concerned, the most challenging topic is probably 
to provide a fully automatic binding-time analysis. As already mentioned, the 
binding-time analysis in [49] may prove to be a good starting point. Still, it is 
likely that at least some user intervention will be required in the foreseeable 
future to specialise more complicated interpreters. 

Another avenue for further investigation is to move from interpreters to pro- 
gram transformers and analysers. A particular kind of program transformer is 
of course a partial evaluator, and one may wonder whether we can specialise, 
e.g., the code from Section 3. Actually, it turns out we can now do this and, 
surprisingly or not, the specialised specialisers we obtain in this way are quite 
similar to the one generated by LOGEN directly. This issue is investigated in [8], 
which provides some first encouraging results. 

In conclusion, we have shown how to use offline specialisation in general 
and LOGEN in particular to specialise logic programming interpreters. We have 
shown how to obtain Jones-optimality for simple self-interpreters, as well as for 
more involved interpreters such as a debugger. We have also shown how to 
specialise interpreters for other programming paradigms, using more sophisticated 
binding-types. We have also presented some experimental results, highlighting 
the speedups that can be obtained, and showing that the LOGEN system can 
be a useful basis for generating compilers for high-level languages. Indeed, we 
soon hope to be able to apply LOGEN to derive a compiler from the interpreter 
in [30], and then to compile high-level B specifications into Prolog code for fast 
animation and verification. 
Abstract. The procedural interpretation of logic programs and queries 
is parametric to the selection rule, i.e. the rule that determines which 
atom is selected in each resolution step. Termination of logic programs 
and queries depends critically on the selection rule. In this survey, we 
present a unified view and comparison of seven notions of universal ter- 
mination considered in the literature, and the corresponding classes of 
programs. For each class, we focus on a sufficient, and in most cases even 
necessary, declarative characterisation for determining that a program is 
in that class. By unifying different formalisms and making appropriate 
assumptions, we are able to establish a formal hierarchy between the 
different classes and their respective declarative characterisations. 



1 Introduction 

The paradigm of logic programming originates from the discovery that a frag- 
ment of first-order logic can be given an elegant computational interpretation. 
Kowalski [40] advocates the separation of the logic and control aspects of a logic 
program and has coined the famous formula 

Algorithm = Logic + Control. 

The programmer should be responsible for the logic part, and hence a logic 
program should be a (first-order logic) specification. The control should be taken 
care of by the logic programming system. One aspect of control in logic programs 
is the selection rule. This is a rule stating which atom in a query is selected in 
each derivation step. It is well-known that soundness and completeness of SLD- 
resolution is independent of the selection rule [2]. However, a stronger property 
is usually required for a selection rule to be useful in programming, namely 
termination. 

Definition 1.1. A terminating control for a program P and a query Q is a 
selection rule s such that every SLD-derivation of P and Q via s is finite. 
In reality, logic programming is far from the ideal that the logic and 
control aspects are separated. Without the programmer being aware of the control 
and writing programs accordingly, logic programs would usually be hopelessly 
inefficient or even non-terminating. 

The usual selection rule of early systems is the LD selection rule: in each 
derivation step, the leftmost atom in a query is selected for resolution. This 
selection rule is based on the assumption that programs are written in such a 
way that the data flow within a query or clause body is from left to right. Under 
this assumption, this selection rule is usually a terminating control. For most 
applications, this selection rule is appropriate in that it allows for an efficient 
implementation. 
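The influence of the data flow on termination under the LD selection rule can be observed directly in any Prolog system, since Prolog selects the leftmost atom. The following is a minimal sketch; app/3 is the usual list concatenation and splits/2 is a hypothetical helper introduced only for this illustration.

```prolog
% Usual list concatenation.
app([], Ys, Ys).
app([X|Xs], Ys, [X|Zs]) :- app(Xs, Ys, Zs).

% splits(+Zs, -Splits): all ways of splitting the list Zs in two.
% The third argument of app/3 is bound when the atom is selected, so every
% LD-derivation is finite and findall/3 returns.
splits(Zs, Splits) :-
    findall(Xs-Ys, app(Xs, Ys, Zs), Splits).

% ?- splits([a,b], S).
% S = [[]-[a,b], [a]-[b], [a,b]-[]].

% With the data flow reversed, as in
%   ?- app(Xs, Ys, Zs), Zs = [a,b].
% the LD rule selects app/3 with all arguments unbound. The three answers
% are still found, but on further backtracking app/3 keeps producing longer
% and longer lists, so the derivation tree is infinite.
```

Reordering the atoms of the query here plays the role of choosing a different selection rule for the same logical reading.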

Second generation logic programming languages allow for dynamic schedul- 
ing, i.e. they have primitives for addressing logic and control separately. Program 
clauses have their usual logical reading. In addition, programs are augmented 
by delay declarations or annotations that specify restrictions on the admissible 
selection rules. These languages include NU-Prolog [74] and Gödel [38]. 

In this survey, we classify programs and queries according to the selection 
rules for which they terminate, hence investigating the influence of the selection 
rule on termination. Like most approaches to the termination problem, we are 
interested in universal termination of logic programs and queries, that is, show- 
ing that all derivations for a program and query (via a certain selection rule) are 
finite. This is in contrast to existential termination [10,23,48]. Also, we consider 
definite logic programs, as opposed to logic programs that also contain negated 
literals in clause bodies. 

Figure 1 gives an overview of the classes we consider. Arrows drawn with 
solid lines stand for set inclusion (“$\rightarrow$” corresponds to “$\subseteq$”). The numbers in the 
figure correspond to statements and examples related to the pair of classes in 
question. 

A program P and query Q strongly terminate if they terminate for all se- 
lection rules. This class of programs has been studied mainly by Bezem [11]. 
Naturally, this class is the smallest we consider. A program P and query Q left- 
terminate if they terminate for the LD selection rule. The vast majority of the 
literature is concerned with this class; see [23] for an overview. A program P and 
query Q $\exists$-terminate if there exists a selection rule for which they terminate. This 
notion of termination has been introduced by Ruggieri [62,63]. Surprisingly, this 
is still not the largest class we consider. Namely, there is the class of programs for 
which there are only finitely many successful derivations (although there could 
also be infinite derivations). We say that these programs have bounded nonde- 
terminism, a notion studied by Pedreschi & Ruggieri [58]. Such programs can 
be transformed into equivalent programs which strongly terminate, as indicated 
in the figure and stated in Theorem 10.11. 

The three remaining classes shown in the figure are related to dynamic 
scheduling, i.e. selection rules where the selection of an atom depends on its 
degree of instantiation at runtime. To explain these classes and their relation- 
ship with left-terminating programs, we have to introduce the concept of modes. 
[Figure 1: diagram relating the seven classes. From bottom to top it shows 
recurrent programs (strong termination), simply acceptable programs (input 
termination), simply $\mathcal{P}$-acceptable programs (input $\mathcal{P}$-termination), 
delay-recurrent programs (local delay termination), acceptable programs 
(left-termination), fair-bounded programs ($\exists$-termination) and bounded 
programs (bounded nondeterminism). The arrows are labelled with the numbers 
of the related statements and examples; an additional arrow labelled 
“Inference+transformation” leads from the bounded programs to the recurrent 
programs.] 

Fig. 1. An overview of the classes 




A mode is a labelling of each argument position of a predicate as either input or 
output. It indicates the intended data flow in a query or clause body. 

An input-consuming derivation is a derivation where an atom can be selected 
only when its input arguments are instantiated to a sufficient degree, so that 
unification with the head of the clause does not instantiate them further. A 
program and a query input terminate if all input-consuming derivations for this 
program and query are finite. This class of programs has been studied by Smaus 
[67] and Bossi et al. [15,16,17]. 

Input-consuming derivations can be restricted by imposing some additional 
instantiation property $\mathcal{P}$ that each selected atom must have. For example, $\mathcal{P}$ 
might be the set of all atoms that are bounded w.r.t. a given level mapping. A 
program and a query input $\mathcal{P}$-terminate if all input-consuming derivations for 
this program and query, restricted by $\mathcal{P}$, are finite. This class of programs has 
been studied by Smaus in very recent work [68]. 

A local selection rule is a selection rule specifying that an atom can only be 
selected if there is no other atom which was introduced (by resolution) more 
recently. Marchiori & Teusink [47] have studied termination for selection rules 
that are both local and delay-safe, i.e. they respect the delay declarations. We 
will call termination w.r.t. such selection rules local delay termination. 

A priori, the LD selection rule, input-consuming selection rules (possibly 
restricted by a property $\mathcal{P}$) and local delay-safe selection rules are not formally 
comparable. Under reasonable assumptions however, one can say that assuming 
input-consuming selection rules is weaker than assuming local and delay-safe 
selection rules, which is again weaker than assuming the LD selection rule. While 
assuming input-consuming selection rules is trivially (though not necessarily 
strictly) weaker than assuming input-consuming selection rules with an 
additional property $\mathcal{P}$, there is little sense in making general comparisons between 
selection rules restricted by some $\mathcal{P}$ and the other classes: it depends on the $\mathcal{P}$. 
However, we can choose $\mathcal{P}$ so that it exactly captures delay-safe selection rules, 
and then it follows of course that assuming $\mathcal{P}$-selection rules is weaker than 
assuming local and delay-safe selection rules. All these inclusions that depend on 
additional assumptions are indicated in the figure by dashed lines. Again, the 
numbers in the figure correspond to statements and examples. 

In this survey, we present declarative characterisations of the classes of pro- 
grams and queries that terminate with respect to each of the mentioned notions 
of termination. The characterisations make use of level mappings and Herbrand 
models in order to provide proof obligations on program clauses and queries. All 
characterisations are sound. Except for the cases of local delay termination and 
input $\mathcal{P}$-termination, they are also complete (in the case of input termination, 
this holds only under certain restrictions). 

This survey is organised as follows. The next section introduces some basic 
concepts and fixes the notation. Then we have seven sections corresponding to 
the seven classes in Fig. 1, defined by increasingly strong assumptions about 
the selection rule. In each section, we introduce a notion of termination and 
provide a declarative characterisation for the corresponding class of terminating 
programs and queries. In Sec. 10, we establish relations between the classes, 
formally showing the implications of Fig. 1. Section 11 discusses the related 
work, and Sec. 12 concludes. 



2 Background and Notation 

We use the notation of Apt [2], when not otherwise specified. In particular, 
throughout this article we consider a fixed language L in which programs and 
queries are written. All the results are parametric with respect to L, provided 
that L is rich enough to contain the symbols of the programs and queries under 
consideration. 
We denote by $U_L$ (resp., $B_L$) the Herbrand universe (resp., base) on $L$. We 
denote by $Term_L$ (resp., $Atom_L$) the set of terms (resp., atoms) on $L$. We use 
typewriter font for logical variables, e.g. X, Ys, upper case letters for arbitrary 
terms, e.g. Xs, and lower case letters for ground terms, e.g. t, x, xs. We denote by 
$inst_L(P)$ ($ground_L(P)$) the set of (ground) instances of all clauses in $P$ that are 
in language $L$. The notation $ground_L(Q)$ for a query $Q$ is defined analogously. 
The domain (resp., set of variables in the range) of a substitution $\theta$ is denoted 
as $Dom(\theta)$ (resp., $Ran(\theta)$). 

The set {1, . . . , n} is denoted by [1, n]. 



2.1 Modes 

For a predicate $p/n$, a mode is an atom $p(m_1, \ldots, m_n)$, where $m_i \in \{I, O\}$ for 
$i \in [1,n]$. Positions with $I$ are called input positions, and positions with $O$ are 
called output positions of $p$. To simplify the notation, an atom written as $p(\mathbf{s},\mathbf{t})$ 
means: $\mathbf{s}$ is the vector of terms filling in the input positions, and $\mathbf{t}$ is the vector 
of terms filling in the output positions. An atom $p(\mathbf{s},\mathbf{t})$ is input-linear if $\mathbf{s}$ is 
linear, i.e. each variable occurs at most once in $\mathbf{s}$. The atom is output-linear if $\mathbf{t}$ 
is linear. A mode for a program consists of a mode for each of its predicates. 

In the literature, several correctness criteria concerning the modes have been 
proposed, most importantly nicely-modedness and well-modedness [2]. In this 
article, we need simply moded programs [4] and well moded programs. The 
former are a special case of nicely moded programs. Note that the use of the 
letters s and t is reversed for clause heads. We believe that this notation naturally 
reflects the data flow within a clause. 

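Definition 2.1. A clause $p(\mathbf{t}_0, \mathbf{s}_{n+1}) \leftarrow p_1(\mathbf{s}_1, \mathbf{t}_1), \ldots, p_n(\mathbf{s}_n, \mathbf{t}_n)$ is simply 
moded if $\mathbf{t}_1, \ldots, \mathbf{t}_n$ is a linear vector of variables and for all $i \in [1, n]$ 

$$Var(\mathbf{t}_i) \cap Var(\mathbf{t}_0) = \emptyset \quad \text{and} \quad Var(\mathbf{t}_i) \cap \bigcup_{j=1}^{i} Var(\mathbf{s}_j) = \emptyset.$$

A query $B$ is simply moded if the clause $p \leftarrow B$ is simply moded, where $p/0$ 
is a fresh predicate symbol. A program is simply moded if all of its clauses are. 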

A query (clause, program) is permutation simply moded if it is simply moded 
modulo reordering of the atoms of the query (each clause body) . 

Thus, a clause is simply moded if the output positions of the body atoms are 
filled in by distinct variables, and every variable occurring in an output position 
of a body atom does not occur in an earlier input position. In particular, every 
unit clause is simply moded. 

Definition 2.2. A query $Q = p_1(\mathbf{s}_1, \mathbf{t}_1), \ldots, p_n(\mathbf{s}_n, \mathbf{t}_n)$ is well moded if for all 
$i \in [1, n]$ and $K = 1$ 

$$Vars(\mathbf{s}_i) \subseteq \bigcup_{j=K}^{i-1} Vars(\mathbf{t}_j) \qquad (1)$$

The clause $p(\mathbf{t}_0, \mathbf{s}_{n+1}) \leftarrow Q$ is well moded if (1) holds for all $i \in [1, n+1]$ and 
$K = 0$. A program is well moded if all of its clauses are. 

A query (clause, program) is permutation well moded if it is well moded 
modulo reordering of the atoms of the query (each clause body) . 
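Condition (1) for queries can be checked mechanically. The sketch below is only an illustration under stated assumptions: modes are given by facts of a hypothetical predicate pred_mode/2 (with i for input and o for output positions), a query is represented as a list of atoms, and all other predicate names are likewise introduced only for this example.

```prolog
% pred_mode(+Name/Arity, -Modes): assumed mode table (hypothetical).
pred_mode(app/3, [i,i,o]).

% well_moded_query(+Atoms): the input variables of each atom occur in
% output positions of atoms to its left (condition (1) with K = 1).
well_moded_query(Atoms) :- check(Atoms, []).

check([], _).
check([A|As], Produced) :-
    split_args(A, Ins, Outs),
    term_variables(Ins, InVars),
    all_in(InVars, Produced),
    term_variables(Outs, OutVars),
    append(Produced, OutVars, Produced1),
    check(As, Produced1).

% split_args(+Atom, -InputArgs, -OutputArgs)
split_args(Atom, Ins, Outs) :-
    Atom =.. [P|Args],
    length(Args, N),
    pred_mode(P/N, Modes),
    split(Modes, Args, Ins, Outs).

split([], [], [], []).
split([i|Ms], [A|As], [A|Is], Os) :- split(Ms, As, Is, Os).
split([o|Ms], [A|As], Is, [A|Os]) :- split(Ms, As, Is, Os).

% Membership tests must use ==/2, since we compare variables.
all_in([], _).
all_in([V|Vs], Set) :- member_eq(V, Set), all_in(Vs, Set).

member_eq(X, [Y|Ys]) :- ( X == Y -> true ; member_eq(X, Ys) ).

% ?- well_moded_query([app([a],[b],Xs), app(Xs,[c],Ys)]).   succeeds
% ?- well_moded_query([app(Xs,[c],Ys), app([a],[b],Xs)]).   fails
```

The second query is not well moded, but it is permutation well moded, since swapping its two atoms yields the first query.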

Almost all programs we consider in this article are permutation well and 
simply moded with respect to the same set of modes. The program in Fig. 9 
is an exception due to the fact that our notion of modes cannot capture that 
sub-arguments of a term can have different modes. We do not always give the 
modes explicitly, but they are usually easy to guess. 

Conceptually, we assume that whenever modes are used in this article, the 
mode of a predicate is unique. To realise the use of one predicate in several modes, 
one can introduce multiple (renamed) versions of the predicate [4,5,32,55]. But 
it is also possible to realise multiple modes without any actual code duplication. 
Then, a mode should be associated with each occurrence of a predicate in a 
program [66,69]. 

2.2 Norms and Level Mappings 

All the characterisations of terminating programs we propose make use of the 
notions of norm and level mapping [20] . Depending on the approach, such notions 
are defined on ground or arbitrary objects. 

In the following definition, $Term_L/\!\sim$ denotes the set of equivalence classes 
of terms modulo variance. Similarly, we define $Atom_L/\!\sim$. 

Definition 2.3. A norm is a function $|.| : U_L \rightarrow \mathbb{N}$. A level mapping is a 
function $|.| : B_L \rightarrow \mathbb{N}$. For a ground atom $A$, $|A|$ is called the level of $A$. 

An atom $A$ is bounded w.r.t. the level mapping $|.|$ if there exists $k \in \mathbb{N}$ such 
that for every $A' \in ground_L(A)$, we have $k > |A'|$. 

A generalised norm is a function $|.| : Term_L/\!\sim \; \rightarrow \mathbb{N}$. A generalised level 
mapping is a function $|.| : Atom_L/\!\sim \; \rightarrow \mathbb{N}$. Abusing notation, we write $|T|$ ($|A|$) 
to denote the value of $|.|$ on the equivalence class of the term $T$ (the atom $A$). 

(Generalised) level mappings are used to measure the “size” of a query and 
show that this size decreases along a derivation, hence showing termination. 
They are usually defined based on (generalised) norms. 

Of course, a generalised norm or level mapping can be interpreted as an 
ordinary norm or level mapping by restricting its domain to ground objects. 
Therefore, we now give some examples of generalised norms and level mappings. 
One commonly used generalised norm is the term size norm, defined as 

size{f{Ti , . . . , T„ )) = 1 -I- size{Ti) -|- . . . -1- size{Tn) if n > 0 

size{T) = 0 if T constant/ variable. 

Intuitively, the size of a term T is the number of function symbols occurring in 
T, excluding constants. Another widely used norm is the list-length function, 
defined as 

$length([T|Ts]) = 1 + length(Ts)$ 

$length(T) = 0$ if $T \neq [\ldots|\ldots]$. 
In particular, for a nil-terminated list $[T_1, \ldots, T_n]$, the list-length is n. We call 
a term of the form $[T_1, \ldots, T_n|Ts]$, where $n \geq 0$, an open list. In particular, any 
variable is an open list. 
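
The two norms above can be computed mechanically. The following is a minimal Python sketch, not part of the original article. It assumes a hypothetical term encoding in which a variable is a Python string and a constant or compound term is a tuple `(functor, arg1, ..., argn)`; the names `cons`, `NIL`, `size` and `length` are illustrative only.

```python
# Assumed encoding (not from the source): a variable is a str;
# a constant or compound term is a tuple (functor, arg1, ..., argn).
NIL = ("[]",)


def cons(head, tail):
    """Build the list cell [head|tail]."""
    return (".", head, tail)


def is_var(t):
    return isinstance(t, str)


def size(t):
    """Term size norm: counts function symbols of arity > 0 in t."""
    if is_var(t) or len(t) == 1:      # variable or constant
        return 0
    return 1 + sum(size(arg) for arg in t[1:])


def length(t):
    """List-length norm: counts the list cells along the spine of t."""
    n = 0
    while not is_var(t) and t[0] == "." and len(t) == 3:
        n += 1
        t = t[2]
    return n


two_plus_x = ("s", ("s", "X"))                   # s(s(X))
closed_list = cons(("a",), cons(("b",), NIL))    # [a, b]
open_list = cons(("a",), cons(("b",), "Ts"))     # [a, b | Ts]
print(size(two_plus_x), length(closed_list), length(open_list))
```

Both functions are defined on non-ground terms and do not depend on variable names, so they can be read as generalised norms in the sense of Def. 2.3.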

We will see later that usually, level mappings measure the input arguments of 
a query, even though this is often just an intuitive understanding and not explicit. 
Moreover, the choice of a particular selection rule often reflects a particular mode 
of the program. In this sense, the choice of the level mapping must depend on 
the selection rule, via the modes. This will be seen in our examples. 

However, apart from the dependency just mentioned, the choice of level 
mapping is an aspect of termination which is rather independent from the choice of 
the selection rule. In particular, one does not find any interesting relationship 
between the underlying norms and the selection rule. This is why the detailed 
study of various norms and level mappings is beyond the scope of this article, 
although it is an important aspect of automated proofs of termination [14,27]. 

We now define level mappings where the dependency on the modes is made 
explicit [32]. 

Definition 2.4. A moded (generalised) level mapping |.| is a (generalised) level 
mapping such that for any (not necessarily ground) s, t and u, |p(s,t)| = 
|p(s,u)|. 

The condition |p(s,t)| = |p(s,u)| states that the level of an atom is indepen- 
dent from the terms in its output positions. 

2.3 Selection Rules 

Let INIT be the set of initial fragments of SLD-derivations in which the last 
query is non-empty. The standard definition of selection rule is as follows: a 
selection rule is a function that, when applied to an element in INIT, yields an 
occurrence of an atom in its last query [2]. In this article, we assume an extended 
definition: we also allow that a selection rule may select no atom (a situation 
called deadlock), and we allow that it not only returns the selected atom, but 
also specifies the set of program clauses that may be used to resolve the atom. 
Whenever we want to emphasise that a selection rule always selects exactly one 
atom together with the entire set of clauses for that atom’s predicate, we speak 
of a standard selection rule. Note that for the extended definition, completeness 
of SLD-resolution is lost in general. Selection rules are denoted by s. 

In practice, selection rules should always be computable functions, but we 
are not concerned with this issue here. 

We now define the various notions of selection rules used in this article. 

A $\mathcal{P}$-selection rule is a selection rule where each selected atom is in some set 
of atoms $\mathcal{P}$, closed under instantiation. Note that this notion is very abstract, 
but this does not mean that every selection rule can be defined as a $\mathcal{P}$-selection 
rule. 

Definition 2.5. Input-consuming selection rules are defined w.r.t. a given mode. 
A selection rule s is input-consuming for a program P if either 

- s selects an atom $p(\mathbf{s},\mathbf{t})$ and a non-empty set of clauses of P such that 
$p(\mathbf{s},\mathbf{t})$ and each head of a clause in the set are unifiable with an mgu $\sigma$, and 
$Dom(\sigma) \cap Vars(\mathbf{s}) = \emptyset$, or 

- s selects an atom $p(\mathbf{s},\mathbf{t})$ that unifies with no clause head from P, together 
with all clauses in P (this models failure), or 

- if the previous cases are impossible, s selects no atom (i.e. we have deadlock). 
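
The first case of Def. 2.5 can be tested for a single atom and clause head. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: terms use a hypothetical encoding (a variable is a string, other terms are tuples `(functor, arg1, ..., argn)`), the clause head is assumed to be renamed apart from the atom, and `input_positions` lists the 1-based input argument positions. Since mgus are unique only up to renaming, the sketch accepts the step when the computed mgu maps the input variables of the atom to pairwise distinct variables, which means that some mgu leaves them untouched.

```python
def is_var(t):
    return isinstance(t, str)


def walk(t, sub):
    """Follow variable bindings in the triangular substitution sub."""
    while is_var(t) and t in sub:
        t = sub[t]
    return t


def occurs(v, t, sub):
    t = walk(t, sub)
    if is_var(t):
        return v == t
    return any(occurs(v, a, sub) for a in t[1:])


def unify(a, b, sub):
    """Return an mgu extending sub, or None if a and b do not unify."""
    a, b = walk(a, sub), walk(b, sub)
    if is_var(a):
        if a == b:
            return sub
        return None if occurs(a, b, sub) else {**sub, a: b}
    if is_var(b):
        return None if occurs(b, a, sub) else {**sub, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        sub = unify(x, y, sub)
        if sub is None:
            return None
    return sub


def resolve(t, sub):
    """Apply the triangular substitution sub exhaustively to t."""
    t = walk(t, sub)
    if is_var(t):
        return t
    return (t[0],) + tuple(resolve(a, sub) for a in t[1:])


def variables(t):
    if is_var(t):
        return {t}
    out = set()
    for a in t[1:]:
        out |= variables(a)
    return out


def input_consuming(atom, head, input_positions):
    """True if atom and head unify without instantiating the atom's input."""
    sub = unify(atom, head, {})
    if sub is None:
        return False
    in_vars = set()
    for i in input_positions:
        in_vars |= variables(atom[i])
    images = [resolve(v, sub) for v in in_vars]
    return all(is_var(t) for t in images) and len(set(images)) == len(images)


# Mode even(I), lte(O, I); clause heads from the EVEN program, renamed apart.
even_head = ("even", ("s", ("s", "X1")))
lte_head = ("lte", ("s", "X1"), ("s", "Y1"))
two = ("s", ("s", ("0",)))
print(input_consuming(("even", "X"), even_head, [1]))       # input X would be bound
print(input_consuming(("lte", "X", two), lte_head, [2]))    # only the output X is bound
```

A full selection rule would also have to handle the failure and deadlock cases of the definition; the sketch covers only the unification condition.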

A selection rule is delay-safe w.r.t. a level mapping |.| if it specifies that an 
atom A can be selected only when A is bounded w.r.t. |.|. 

Consider a query, containing atoms A and B, in an initial fragment $\xi$ of a 
derivation. Then A is introduced more recently than B if the derivation step 
introducing A comes after the step introducing B, in $\xi$. A local selection rule is 
a selection rule that specifies that an atom in a query can be selected only if 
there is no more recently introduced atom in the query. 

The usual LD selection rule (also called leftmost selection rule) always selects 
the leftmost atom in the last query of an element in INIT . The RD selection 
rule (also called rightmost) always selects the rightmost atom. 

A standard selection rule s is fair if for every SLD-derivation $\xi$ via s either 
$\xi$ is finite or for every atom A in $\xi$, (some further instantiated version of) A is 
eventually selected. 

2.4 Universal Termination 

In general terms, the problem of universal termination of a program P and a 
query Q w.r.t. a set of selection rules consists of showing that every rule in the 
set is a terminating control for P and Q. 

Definition 2.6. A program P and a query Q universally terminate w.r.t. a set 
of selection rules S if every SLD-derivation of P and Q via any selection rule 
from S is finite. 

Note that, since SLD-trees are finitely branching, by König’s Lemma, “every 
SLD-derivation for P and Q via a selection rule s is finite” is equivalent to stating 
that the SLD-tree of P and Q via s is finite. 

We say that a class of programs and queries is a sound characterisation of 
universal termination w.r.t. S if every program and query in the class universally 
terminate w.r.t. S. Conversely, it is complete if every program and query that 
universally terminate w.r.t. S are in the class. 

2.5 Models 

Several of the criteria for termination we consider rely on information supplied 
by a model of the program under consideration. We provide the definition of 
Herbrand interpretations and models [2]. 

Footnote: The reader may be surprised that delay-safe selection rules make no reference to 
delay declarations. This is a terminological shortcut. 
A Herbrand interpretation I is a set of ground atoms. A ground atom A 
is true in I, written $I \models A$, if $A \in I$. This notation is extended to ground 
queries in the obvious way. I is a Herbrand model of program P if for each 
$A \leftarrow B_1, \ldots, B_n \in ground_L(P)$, we have that $I \models B_1, \ldots, B_n$ implies $I \models A$. 

When speaking of the least Herbrand model of P, we mean least w.r.t. set 
inclusion. In termination analysis, it is usually not necessary to consider the least 
Herbrand model, which may be difficult or impossible to determine. Instead, one 
uses models that capture some argument size relationship between the arguments 
of each predicate [23]. For example, a model for the usual append predicate is 

{append(xs, ys, zs) | length(zs) = length(xs) + length(ys)}. 
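
That the usual append clauses only derive atoms inside this set can be checked on a small scale. The following Python sketch is an illustration under stated assumptions, not part of the original article: ground lists are represented as Python tuples over a hypothetical two-element alphabet `ELEMS`, and `tp_step` mimics one application of the immediate-consequence operator restricted to the recursive append clause.

```python
from itertools import product

ELEMS = ("a", "b")   # assumed alphabet of list elements
MAX_LEN = 2          # assumed bound on the length of Ys in the unit clause


def all_lists(max_len):
    """All ground lists (as tuples) over ELEMS with at most max_len elements."""
    out = []
    for n in range(max_len + 1):
        out.extend(product(ELEMS, repeat=n))
    return out


def tp_step(facts):
    """Add the heads append([X|Xs],Ys,[X|Zs]) whose body append(Xs,Ys,Zs) is in facts."""
    new = set(facts)
    for xs, ys, zs in facts:
        for x in ELEMS:
            new.add(((x,) + xs, ys, (x,) + zs))
    return new


def in_length_model(fact):
    """Membership test for {append(xs,ys,zs) | length(zs) = length(xs) + length(ys)}."""
    xs, ys, zs = fact
    return len(zs) == len(xs) + len(ys)


# Ground instances of the unit clause append([],Ys,Ys).
facts = {((), ys, ys) for ys in all_lists(MAX_LEN)}
for _ in range(3):
    facts = tp_step(facts)

print(all(in_length_model(f) for f in facts))
```

The set is a model but far from the least one: it also contains atoms such as append([a], [], [b]), which is exactly why such argument size models are easier to determine than the least Herbrand model.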

3 Strong Termination 

3.1 Operational Definition 

Early approaches to the termination problem treated universal termination 
w.r.t. all selection rules, called strong termination. Generally speaking, strongly 
terminating programs and queries are either very trivial or especially written for 
theoretical considerations. 

Definition 3.1. A program P and query Q strongly terminate if they univer- 
sally terminate w.r.t. the set of all selection rules. 

3.2 Declarative Characterisation 

In the following, we recall the approach of Bezem [11], who defined the class of 
recurrent programs and queries. Intuitively, a program is recurrent if for every 
ground instance of a clause, the level of the body atoms is smaller than the level 
of the head. 

Definition 3.2. Let |.| be a level mapping. 

A program P is recurrent by |.| if for every $A \leftarrow B_1, \ldots, B_n \in ground_L(P)$: 

for $i \in [1, n]$, $|A| > |B_i|$. 

A query Q is recurrent by |.| if there exists $k \in \mathbb{N}$ such that for every 
$A_1, \ldots, A_n \in ground_L(Q)$: 

for $i \in [1, n]$, $k > |A_i|$. 

In the above definition, the proof obligations for a query Q are derived from 
those for the program $\{p \leftarrow Q\}$, where p/0 is a fresh predicate symbol. Intuitively, 
this is justified by the fact that the termination behaviour of the query Q and a 
program P is the same as for the query p and the program $P \cup \{p \leftarrow Q\}$. So k 
plays the role of the level of the atom p. In the original work [11], the query was 
called bounded. Throughout the paper, we prefer to maintain a uniform naming 
convention both for programs and queries. 

Termination properties of recurrent programs are summarised in the follow- 
ing theorem. 
% sat(Formula) ← 
%   there is a true instance of Formula 
sat(true). 
sat(X ∧ Y) ← 
    sat(X), sat(Y). 
sat(not X) ← inval(X). 

Fig. 2. SAT 

Theorem 3.3 ([11]). Let P be a program and Q a query. 

If P and Q are both recurrent by some |.|, then they strongly terminate. 
Conversely, if P and every ground query strongly terminate, then P is recur- 
rent by some level mapping |.|. If in addition P and Q strongly terminate, then 
P and Q are both recurrent by some level mapping |.|. 

Proof. The result is shown in [11] for standard selection rules. It easily extends to 
our generalisation of selection rules by noting that P and Q strongly terminate iff 
they universally terminate w.r.t. the set of standard selection rules. The only-if 
part is immediate. The if-part follows by noting that a derivation via an arbitrary 
selection rule is a (prefix of a) derivation via a standard selection rule. 

3.3 Examples 

Example 3.4. The program SAT in Fig. 2 decides propositional satisfiability. The 
program is readily checked to be recurrent by |.|, where we define 

|sat(t)| = |inval(t)| = size(t). 

Note that Def. 3.2 imposes no proof obligations for unit clauses. The query 
sat(X) is recurrent iff there exists a natural k such that for every ground 
instance x of X, we have that size(x) is bounded by k. Obviously, this is the case 
iff X is already a ground term. For instance, the query sat(not(true) ∧ false) 
is recurrent, while the query sat(false ∧ X) is not. 
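
The proof obligations of Def. 3.2 for SAT can be spot-checked on a finite sample of ground clause instances. The following Python sketch is illustrative only and assumes a hypothetical encoding: formulas are tuples such as `("and", x, y)`, `("not", x)`, `("true",)` and `("false",)`, and atoms are tuples `("sat", t)` or `("inval", t)`. A finite sample cannot prove recurrence, it only fails to find a counterexample.

```python
from itertools import product


def size(t):
    """Term size norm on the assumed tuple encoding (variables are strings)."""
    if isinstance(t, str) or len(t) == 1:
        return 0
    return 1 + sum(size(a) for a in t[1:])


def ground_formulas(depth):
    """Ground formulas over true/false with nesting depth at most depth."""
    level = [("true",), ("false",)]
    for _ in range(depth):
        level = (level
                 + [("not", f) for f in level]
                 + [("and", f, g) for f, g in product(level, repeat=2)])
    return level


def level_of(atom):
    """|sat(t)| = |inval(t)| = size(t)."""
    return size(atom[1])


def clause_instances(formulas):
    """Ground instances (head, body) of the non-unit clauses of SAT."""
    for x, y in product(formulas, repeat=2):
        conj = ("and", x, y)
        yield ("sat", conj), [("sat", x), ("sat", y)]
        yield ("inval", conj), [("inval", x)]
        yield ("inval", conj), [("inval", y)]
    for x in formulas:
        yield ("sat", ("not", x)), [("inval", x)]
        yield ("inval", ("not", x)), [("sat", x)]


def recurrent_on(instances):
    """Check |head| > |body atom| for every given ground clause instance."""
    return all(level_of(h) > level_of(b) for h, body in instances for b in body)


def query_levels(depth):
    """Levels of the sampled ground instances of the query sat(false ∧ X)."""
    return {level_of(("sat", ("and", ("false",), x))) for x in ground_formulas(depth)}


print(recurrent_on(clause_instances(ground_formulas(1))))
print(max(query_levels(1)), max(query_levels(2)))
```

The maximum level of the sampled instances of sat(false ∧ X) grows with the sampling depth, which matches the observation that no bound k exists for this query.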

Note that the choice of an appropriate level mapping depends on the intended 
mode of the program and query. Even though this is usually not explicit, level 
mappings measure the size of the input arguments of an atom [32]. 

Example 3.5. Figure 3 shows the APPEND program. It is easy to check that 
APPEND is recurrent by the level mapping |append(xs, ys, zs)| = length(xs) and 
also by |append(xs, ys, zs)| = length(zs). A query append(Xs, Ys, Zs) is recurrent 
by the first level mapping iff Xs is anything other than an open list, and by 
the second iff Zs is anything other than an open list. The level mapping 

|append(xs, ys, zs)| = min{length(xs), length(zs)} 

combines the advantages of both level mappings. APPEND is easily seen to be 
recurrent by it, and if Xs or Zs is anything other than an open list, then 
append(Xs, Ys, Zs) is recurrent by it. 
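
Whether an append query is recurrent by one of these level mappings comes down to whether the measured argument has a bounded list-length over all its ground instances. The following Python sketch illustrates this under assumptions that are not in the source: variables are strings, list cells are tuples `(".", head, tail)`, and an open list is read as a term whose list spine ends in a variable.

```python
NIL = ("[]",)


def cons(head, tail):
    """Build the list cell [head|tail] in the assumed encoding."""
    return (".", head, tail)


def spine(t):
    """Return (number of list cells, final tail) of the term t."""
    n = 0
    while not isinstance(t, str) and t[0] == "." and len(t) == 3:
        n, t = n + 1, t[2]
    return n, t


def bound_on_length(t):
    """Bound on length over all ground instances of t, or None if unbounded."""
    n, tail = spine(t)
    return None if isinstance(tail, str) else n


def query_bound(xs, zs):
    """Bound on min{length(xs), length(zs)} over ground instances, or None."""
    bounds = [b for b in (bound_on_length(xs), bound_on_length(zs)) if b is not None]
    return min(bounds) if bounds else None


closed = cons(("a",), cons(("b",), NIL))   # [a, b]
opened = cons(("a",), "Xs")                # [a | Xs]

print(query_bound(opened, closed))   # bounded through the third argument
print(query_bound("Xs", "Zs"))       # unbounded: both arguments are open lists
```

A query append(Xs, Ys, Zs) with `query_bound` returning some b is recurrent by the min-based level mapping with k = b + 1.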



inval(false). 
inval(X ∧ Y) ← inval(X). 
inval(X ∧ Y) ← inval(Y). 
inval(not X) ← sat(X). 

Fig. 2 (continued). SAT: clauses for inval 
% reverse(Xs,Ys) ← 
%   Xs is the reverse of list Ys. 
reverse([X|Xs],Ys) ← 
    append(Zs,[X],Ys), 
    reverse(Xs,Zs). 
reverse([],[]). 

% append(Xs,Ys,Zs) ← 
%   Zs is the result of concatenating 
%   lists Xs and Ys. 
append([X|Xs],Ys,[X|Zs]) ← 
    append(Xs,Ys,Zs). 
append([],Ys,Ys). 

Fig. 3. APPEND and NAIVE_REVERSE 

% even(X) ← 
%   X is an even natural number. 
even(s(s(X))) ← even(X). 
even(0). 

% lte(X,Y) ← 
%   X,Y are natural numbers 
%   s.t. X is smaller or equal than Y 
lte(s(X),s(Y)) ← lte(X,Y). 
lte(0,Y). 

Fig. 4. EVEN 



3.4 On Completeness of the Characterisation 

Note that completeness is not stated in full general terms, i.e. recurrence is not 
a complete proof method for strong termination. Informally speaking, incom- 
pleteness is due to the use of level mappings, which are functions that must 
specify a value for every ground atom. Therefore, if P strongly terminates for a 
certain ground query Q but not for all ground queries, we cannot conclude that 
P is recurrent. We provide a general completeness result in Sec. 7 for a class of 
programs containing recurrent programs. 



4 Input Termination 

In this section, we consider input-consuming selection rules [17]. 

We have said above that the class of strongly terminating programs and 
queries is very limited. Even if a program is recurrent, it may not strongly 
terminate for a query of interest since the query is not recurrent. 

Example 4.1. The program EVEN in Fig. 4 is recurrent by defining 

|even(x)| = size(x) 

|lte(x,y)| = size(y). 

Now consider the query Q = even(X), lte(X, $s^{100}(0)$), which is supposed to 
compute the even numbers not exceeding 100. By always selecting the leftmost 
atom, one can easily obtain an infinite derivation for EVEN and Q. As a 
consequence of Theorem 3.3, Q is not recurrent. 
4.1 Operational Definition 

Definition 4.2. A program P and query Q input terminate if they universally 
terminate w.r.t. the set consisting of the input-consuming selection rules. 

The requirement of input-consuming derivations merely reflects the very 
meaning of input: an atom must only consume its own input, not produce it. 
In existing implementations, input-consuming derivations can be ensured using 
control constructs such as delay declarations [38,70,73,74]. 

In the above example, the obvious mode is even(I), lte(O, I). With this 
mode, we will show that EVEN and Q input terminate. If we assume a selection 
rule that is input-consuming while always selecting the leftmost atom if 
possible, then the above example is a contrived instance of the generate-and-test 
paradigm. This paradigm involves two procedures, one which generates a set of 
candidates, and another which tests whether these candidates are solutions to 
the problem. The test occurs to the left of the generator so that tests take place 
as soon as possible, i.e. as soon as sufficient input has been generated for the 
derivation to be input-consuming. 

Proofs of input termination differ from proofs of strong termination in an 
important respect. For the latter, we require that the initial query is recurrent, 
and as a consequence we have that all queries in any derivation from it are 
recurrent (we say that recurrence is persistent under resolution). This means 
that, at the time an atom is selected, the depth of its SLD-tree is bounded. In 
contrast, input termination does not have such a strong requirement on each 
selected atom. 

Example 4.3. Consider the EVEN program in Fig. 4 and the following 
input-consuming derivation, where we underline the selected atom in each step: 

$even(X),\ \underline{lte(X, s^{100}(0))} \to even(s(X')),\ \underline{lte(X', s^{99}(0))} \to$ 

$\underline{even(s(s(X'')))},\ lte(X'', s^{98}(0)) \to even(X''),\ lte(X'', s^{98}(0)) \ldots$ 

At the time when even(s(s(X"))) is selected, the depth of its SLD-tree is not 
bounded. In fact, this depth depends on the eventual instantiation of X". 

The method for showing input termination inherently relies on a notion of 
level for atoms such as even(s(s(X"))), although this level is not bounded. This 
is the key to showing termination for derivations with coroutining (interleaving 
subderivations). In contrast, most approaches to termination assume that the 
level of the selected atom is bounded. We refer to Subsec. 11.7 and [66, Sec. 11.1]. 

4.2 Information on Data Flow: Simply-Local Substitutions and 
Models 

Since the depth of the SLD-tree of the selected atom depends on further instan- 
tiation of the atom, it is important that programs are well-behaved w.r.t. the 
modes. This is illustrated in the following example. 
Example 4.4. Consider the APPEND program (Fig. 3) in mode append(I, I, O) 
and the query 

append([1|As], [], Bs), append(Bs, [], As). 

Then we have the following infinite input-consuming derivation: 

append([1|As], [], Bs), append(Bs, [], As) → 
append(As, [], Bs'), append([1|Bs'], [], As) → 
append([1|As'], [], Bs'), append(Bs', [], As') → ... 

This well-known termination problem of programs with coroutining has been 
identified as circular modes by Naish [55]. 

To avoid the above situation, we require programs and queries to be simply 
moded (see Subsec. 2.1). 

We now define simply-local substitutions, which reflect the way simply moded 
clauses become instantiated in input-consuming derivations. Given a clause 
$c = p(t_0, s_{n+1}) \leftarrow p_1(s_1, t_1), \ldots, p_n(s_n, t_n)$ used in an input-consuming derivation, 
first $t_0$ becomes instantiated, and the range of that substitution contains only 
variables from outside of c. Then, by resolving $p_1(s_1, t_1)$, the vector $t_1$ becomes 
instantiated, and the range of that substitution contains variables from outside 
of c in addition to variables from $s_1$. Continuing in the same way, finally, by 
resolving $p_n(s_n, t_n)$, the vector $t_n$ becomes instantiated, and the range of that 
substitution contains variables from outside of c in addition to variables from 
$s_1 \ldots s_n$. A substitution is simply-local if it is composed from substitutions as 
sketched above. We now give the formal definition [17]. 

Definition 4.5. A substitution $\theta$ is simply-local w.r.t. the clause 
$c = p(t_0, s_{n+1}) \leftarrow p_1(s_1, t_1), \ldots, p_n(s_n, t_n)$ if there exist substitutions 
$\sigma_0, \sigma_1, \ldots, \sigma_n$ and disjoint sets $V_0, V_1, \ldots, V_n$ consisting of fresh (w.r.t. c) 
variables such that $\theta = \sigma_0 \sigma_1 \cdots \sigma_n$ where for $i \in \{0, \ldots, n\}$, 

- $Dom(\sigma_i) \subseteq Vars(t_i)$, 

- $Ran(\sigma_i) \subseteq Vars(s_i \sigma_0 \sigma_1 \cdots \sigma_{i-1}) \cup V_i$. 

$\theta$ is simply-local w.r.t. a query B if $\theta$ is simply-local w.r.t. the clause $p \leftarrow B$ 
where p/0 is a fresh predicate symbol. 

Note that in the case of a simply-local substitution w.r.t. a query, $\sigma_0$ is the 
empty substitution, since $Dom(\sigma_0) \subseteq Vars(p)$ where p is a fresh predicate symbol. 
Note also that if $A, B, C \to (A, B, C)\theta$ is an input-consuming derivation step 
using clause $c = H \leftarrow B$, then $\theta|_H$ is simply-local w.r.t. the clause $H \leftarrow$ and $\theta|_B$ 
is simply-local w.r.t. the atom B [17]. 

Example 4.6. Consider the PERMUTE_BACK program in Fig. 5 (the name has been 
chosen to distinguish it from PERMUTE to be introduced later). Assume mode 
permute(O, I), insert(O, O, I). We examine the recursive clause for insert. 



Footnote: Note that s_0 is undefined. By abuse of notation, Vars(s_0 
% permute(Xs,Ys) ← 
%   Ys is a permutation of the list Xs. 
permute([X|Xs],Ys) ← 
    insert(Zs,X,Ys), 
    permute(Xs,Zs). 
permute([],[]). 

% insert(Xs,X,Zs) ← 
%   Zs is obtained by inserting X into Xs 
insert(Xs,X,[X|Xs]). 
insert([Y|Xs],X,[Y|Zs]) ← 
    insert(Xs,X,Zs). 

Fig. 5. PERMUTE_BACK 



The substitution $\sigma$ = {Y/V, Zs/[W], Xs/[], X/W} is simply-local w.r.t. it: let $\sigma_0$ = 
{Y/V, Zs/[W]}, $\sigma_1$ = {X/W, Xs/[]}; then $Dom(\sigma_0) \subseteq$ {Y, Zs}, $Ran(\sigma_0) \subseteq V_0$ where 
$V_0$ = {V, W}, $Dom(\sigma_1) \subseteq$ {Xs, X}, and $Ran(\sigma_1) \subseteq Vars(Zs\,\sigma_0)$. 
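
The Dom and Ran conditions of Def. 4.5 can be checked mechanically for a given decomposition. The following Python sketch is an illustration under stated assumptions, not the authors' method: terms use a hypothetical encoding (variables are strings, other terms are tuples), each substitution is a dictionary applied simultaneously, and the function only verifies a decomposition that is supplied by the caller. Freshness of the sets V_i w.r.t. the clause is assumed rather than checked.

```python
NIL = ("[]",)


def cons(head, tail):
    return (".", head, tail)


def is_var(t):
    return isinstance(t, str)


def vars_of(terms):
    """Variables occurring in a list of terms."""
    out = set()
    for t in terms:
        if is_var(t):
            out.add(t)
        else:
            out |= vars_of(list(t[1:]))
    return out


def apply(t, sub):
    """Apply the substitution sub (a dict) simultaneously to the term t."""
    if is_var(t):
        return sub.get(t, t)
    return (t[0],) + tuple(apply(a, sub) for a in t[1:])


def check_decomposition(t_vecs, s_vecs, sigmas, fresh):
    """Check Dom/Ran for sigma_0..sigma_n; s_vecs[0] must be empty (s_0 is undefined)."""
    for i, v_i in enumerate(fresh):                  # the V_i must be pairwise disjoint
        if any(v_i & v_j for v_j in fresh[i + 1:]):
            return False
    applied = []                                     # sigma_0, ..., sigma_{i-1}
    for t_i, s_i, sigma, v_i in zip(t_vecs, s_vecs, sigmas, fresh):
        inst = list(s_i)
        for prev in applied:                         # s_i sigma_0 ... sigma_{i-1}
            inst = [apply(term, prev) for term in inst]
        dom_ok = set(sigma) <= vars_of(t_i)
        ran_ok = vars_of(list(sigma.values())) <= (vars_of(inst) | v_i)
        if not (dom_ok and ran_ok):
            return False
        applied.append(sigma)
    return True


# Clause insert([Y|Xs],X,[Y|Zs]) <- insert(Xs,X,Zs) in mode insert(O, O, I):
# t_0 = ([Y|Zs]) is the head input, s_1 = (Zs), t_1 = (Xs, X).
ok = check_decomposition(
    t_vecs=[[cons("Y", "Zs")], ["Xs", "X"]],
    s_vecs=[[], ["Zs"]],
    sigmas=[{"Y": "V", "Zs": cons("W", NIL)}, {"X": "W", "Xs": NIL}],
    fresh=[{"V", "W"}, set()],
)
print(ok)
```

Deciding whether an arbitrary substitution is simply-local would additionally require searching for a suitable decomposition, which this sketch does not attempt.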

Based on simply-local substitutions, we now define a restricted notion of 
model. 

Definition 4.7. Let $I \subseteq Atom_L$. We say that I is a simply-local model of 
$c = H \leftarrow B_1, \ldots, B_n$ if for every substitution $\theta$ simply-local w.r.t. c, 

if $B_1\theta, \ldots, B_n\theta \in I$ then $H\theta \in I$. (2) 

I is a simply-local model of a program P if it is a simply-local model of each 
clause of it. 

Note that a simply-local model is not necessarily a model in the classical 
sense, since I is not necessarily a set of ground atoms, and the substitution 
in (2) is required to be simply-local. For example, given the program 
{q(1), p(X) ← q(X)} with mode q(I), p(O), a model must contain the atom p(1), 
whereas a simply-local model does not necessarily contain p(1), since {X/1} is 
not simply-local w.r.t. p(X) ← q(X). The next subsection will further clarify the 
role of simply-local models. 

Let $SM_P$ be the set of all simply moded atoms in $Atom_L$. It has been 
shown that the least simply-local model of P containing $SM_P$ exists and can be 
computed by a variant of the well-known $T_P$-operator [17]. We denote the least 
simply-local model of P containing $SM_P$ by $PM_P^{SL}$, for partial model. 

Example 4.8. Recall Ex. 4.6. $SM_P$ consists of all atoms insert(Us, U, Vs) where 
$Us, U \notin Vars(Vs)$. To construct $PM_P^{SL}$, we iterate $T_P^{SL}$ starting from any atom 
in $SM_P$ (the resulting atoms are written on the l.h.s. below) and the fact clause 
(r.h.s.). Each line below corresponds to one iteration of $T_P^{SL}$. We have $PM_P^{SL}$ = 

{ insert(Us, U, Vs), 
  insert([Y_1|Us], U, [Y_1|Vs]),           insert(Xs_1, X_1, [X_1|Xs_1]), 
  insert([Y_2,Y_1|Us], U, [Y_2,Y_1|Vs]),   insert([Y_1|Xs_1], X_1, [Y_1,X_1|Xs_1]),   (3) 
  | Vs, Xs_1, X_1, Y_1, Y_2, ... arbitrary where $Us, U \notin Vars(Vs)$ }. 

Footnote: We sometimes say “atom” for “query containing only one atom”. 
Observe the variable occurrences of U, Us in the atoms on the l.h.s. In Ex. 5.5, 
we will see the importance of such variable occurrences. 

4.3 Declarative Characterisation 

Bossi et al. [17] define simply-acceptability, which is the notion of decrease used 
for proving input termination. 

We write $p \simeq q$ if p and q are mutually recursive predicates [2]. Abusing 
notation, we also use $\simeq$ for atoms, where $p(s,t) \simeq q(u,v)$ stands for $p \simeq q$. 

Definition 4.9. Let P be a simply moded program, |.| a moded generalised 
level mapping and I a simply-local model of P containing $SM_P$. A clause 
$A \leftarrow B_1, \ldots, B_n$ is simply acceptable by |.| and I if for every substitution $\theta$ 
simply-local w.r.t. it, 

for all $i \in [1, n]$, $(B_1, \ldots, B_{i-1})\theta \in I$ and $A \simeq B_i$ imply $|A\theta| > |B_i\theta|$. 

The program P is simply acceptable by |.| and I if each clause of P is simply 
acceptable by |.| and I. 

Admittedly, the proof obligations may be difficult to verify, especially in the 
cases where a small (precise) simply-local model is required. However, as our 
examples show, often it is not necessary at all to consider the model, as one can 
show the decrease for arbitrary instantiations of the clause. 

Simply-acceptability, and simply $\mathcal{P}$-acceptability to be introduced in the next 
section, are not based on ground instances of clauses, but rather on instances 
obtained by applying simply-local substitutions, which arise in input-consuming 
derivations of simply moded programs. This is in contrast to all other character- 
isations in this article, and explains why we use generalised level mappings and 
a special kind of models. 

Also note that in contrast to recurrence and other decreasing notions to be 
defined later, simply-acceptability has no proof obligation on queries (apart from 
the requirement that queries must be simply moded). Intuitively, such a proof 
obligation is made redundant by the mode conditions (simply-acceptability and 
moded level mapping) and the fact that derivations must be input-consuming. 
We also refer to Subsec. 10.1. 

Simply-acceptability characterises the class of input terminating programs. 

Theorem 4.10 ([17]). Let P and Q be a simply moded program and query. 

If P is simply acceptable by some |.| and I, then P and Q input terminate. 
Conversely, if P and every simply moded query input terminate, then P is 
simply acceptable by some moded generalised level mapping |.| and $PM_P^{SL}$. 

The formulation of the theorem differs slightly from the original for reasons 
of consistency, but one can easily see that the formulations are equivalent. 



Footnote (on “generalised” in Def. 4.9): In [17], the word “generalised” is dropped, 
but here we prefer to emphasise that non-ground atoms are included in the domain. 
permute([X|Xs],Ys) ← 
    permute(Xs,Zs), 
    insert(Zs,X,Ys). 
permute([],[]). 

insert(Xs,X,[X|Xs]). 
insert([Y|Xs],X,[Y|Zs]) ← 
    insert(Xs,X,Zs). 

Fig. 6. PERMUTE 



Remark 4.11. The definition of input-consuming derivations is independent from 
the textual order of atoms in a query, and so the textual order is irrelevant for 
termination. Therefore, if we can prove input termination for a program and 
query, we have also proven termination for a program obtained by permuting 
the body atoms of each clause and the query in an arbitrary way. 

It would have been possible to state this remark explicitly in the above theo- 
rem, but that would have complicated the definition of simply-local substitution 
and subsequent definitions. Generally, the question of when it is necessary to 
make the permutations of body atoms explicit is discussed in [66, Sec. 5.3]. 

4.4 Examples 

Example 4.12. The program EVEN in Fig. 4 is simply acceptable with mode 
even(I), lte(O, I) by using the level mapping in Ex. 4.1, interpreted as moded 
generalised level mapping in the obvious way, and using any simply-local model. 
Moreover, the query even(X), lte(X, $s^{100}(0)$) is permutation simply moded (see 
Remark 4.11). Hence EVEN and this query input terminate. 

Example 4.13. The program PERMUTE is shown in Fig. 6. Assume the mode 
permute(I, O), insert(I, I, O). Note that compared to Fig. 5, two body atoms 
have been reordered to make the program simply moded in this mode. Note also 
that $permute \not\simeq insert$. The program is readily checked to be simply acceptable, 
using the moded generalised level mapping 

|permute(Xs, Ys)| = |insert(Xs, Ys, Zs)| = size(Xs) 

and any simply-local model. Thus the program and any simply moded query 
input terminate. It can also easily be shown that the program is not recurrent. 

Example 4.14. Figure 7 shows program 15.3 from [72]: QUICKSORT using a form 
of difference lists (we permuted two body atoms for the sake of clarity). This 
program is simply moded with mode quicksort(I, O), quicksort_dl(I, O, I), 
partition(I, I, O, O), =<(I, I), >(I, I). 

We use the following moded generalised level mapping (positions with _ are 
irrelevant) 

|quicksort_dl(Xs, _, _)| = length(Xs), 
|partition(Xs, _, _, _)| = length(Xs). 
% quicksort(Xs,Ys) ← Ys is an ordered permutation of Xs. 

quicksort(Xs,Ys) ← quicksort_dl(Xs,Ys,[]). 

quicksort_dl([X|Xs],Ys,Zs) ← 
    partition(Xs,X,Littles,Bigs), 
    quicksort_dl(Bigs,Ys1,Zs), 
    quicksort_dl(Littles,Ys,[X|Ys1]). 
quicksort_dl([],Xs,Xs). 

partition([X|Xs],Y,[X|Ls],Bs) ← X =< Y, partition(Xs,Y,Ls,Bs). 
partition([X|Xs],Y,Ls,[X|Bs]) ← X > Y, partition(Xs,Y,Ls,Bs). 
partition([],Y,[],[]). 

Fig. 7. QUICKSORT 



The level mapping of all other atoms can be set to 0. Concerning the model, the 
simplest solution is to use the model that expresses the dependency between the 
list lengths of the arguments of partition, i.e. I should contain all atoms of the 
form partition($S_1$, X, $S_2$, $S_3$) where $|S_1| \geq |S_2|$ and $|S_1| \geq |S_3|$. Note that this 
includes all simply moded atoms using partition, and that this model is a 
fortiori simply-local since (2) in Def. 4.7 is true even for arbitrary $\theta$. 

The program is then simply acceptable by |.| and I and hence input 
terminates for every simply moded query. 

In essence, looking at the clause before any instantiation, there is a decrease 
between the input of the clause head and the recursive body atoms ([X|Xs] is big- 
ger than both Bigs and Littles). Moreover, by the model information about the 
atom partition(Xs, X, Littles, Bigs) we know that this decrease is preserved 
as the clause becomes instantiated. 



5 Input $\mathcal{P}$-Termination 

In this section, we consider input-consuming selection rules that are additionally 
parametrised by some instantiation property $\mathcal{P}$ that each selected atom must 
have. In particular, delay-safe derivations can be modelled this way. This section 
is based on very recent work [68]. 

We first give an example of a program that is not input terminating. 

Example 5.1. Consider again the PERMUTE_BACK program in Fig. 5. So the mode 
is permute(O, I), insert(O, O, I). It is immediate to check that the program 
is not input terminating: by repeatedly selecting the rightmost atom that may 
be selected, the query permute(Xs, [1]) generates an infinite input-consuming 
derivation. 

One can understand this by explaining why the program cannot be simply 
acceptable. Recall Ex. 4.8. $PM^{SL}_{PERMUTE\_BACK}$ contains every atom of the form 
insert(Us, U, Vs), i.e. every simply moded atom whose predicate is insert. 
Therefore in particular insert(Us, U, Vs) $\in PM^{SL}_{PERMUTE\_BACK}$ (note that Vs is a 
variable). Consider the recursive clause for permute. The substitution $\theta$ = {Ys/Vs, 
Zs/Us, X/U} is simply-local w.r.t. the clause. Therefore, for the clause to be 
simply acceptable, there would have to be a moded generalised level mapping such 
that |permute([U|Xs], Vs)| > |permute(Xs, Us)|. This is a contradiction since a 
moded generalised level mapping is necessarily defined as a generalised norm of 
the second argument of permute, and Vs and Us are equivalent modulo variance. 

However, all derivations for permute(Xs, [1]) are finite if we require input- 
consuming derivations where each atom must be bounded w.r.t. an appropriate 
level mapping. 

The attentive reader may have noticed that PERMUTE_BACK falls out of the 
class of input terminating programs for a very simple reason: Due to the variable 
Ys in the input position of the clause head, it follows that an atom using permute 
can always be selected. 

Now it is tempting to think that the program misses the property of input 
termination “just narrowly”, and that there is a simple fix to obtain input termi- 
nation: replace Ys by [Y|Ys] in the above clause. This is a fallacy. The resulting 
program is still not input terminating. This is related to speculative output bind- 
ings and has first been observed by Naish [55]. 

Programs that “just narrowly” miss the property of input termination may 
also be analysed using the methods of this section. We refer to [68]. 

5.1 Operational Definition 

We now define termination for input-consuming $\mathcal{P}$-derivations, i.e. derivations 
via an input-consuming $\mathcal{P}$-selection rule. 

Definition 5.2. A program P and query Q input $\mathcal{P}$-terminate if they universally 
terminate w.r.t. the set consisting of the input-consuming $\mathcal{P}$-selection rules. 

Of course, input termination is just a special case of input $\mathcal{P}$-termination for 
a trivial $\mathcal{P}$ containing all atoms. However, in contrast to the previous section, it is 
unknown if the characterisation given here is complete. This justifies having the 
previous section on its own. Also, the previous section surveys well-established 
work while the work reported here is very recent. 

5.2 Declarative Characterisation 

Definition 5.3. Let P be a simply moded program, |.| a moded generalised 
level mapping and I a simply-local model of P containing $SM_P$. A clause 
$A \leftarrow B_1, \ldots, B_n$ is simply $\mathcal{P}$-acceptable by |.| and I if for every substitution 
$\sigma$ simply-local w.r.t. it, for all $i \in [1, n]$, 

$B_1\sigma, \ldots, B_{i-1}\sigma \in I$ and $A \simeq B_i$ and $B_i\sigma \in \mathcal{P}$ imply $|A\sigma| > |B_i\sigma|$. (4) 

The program P is simply $\mathcal{P}$-acceptable by |.| and I if each clause of P is simply 
$\mathcal{P}$-acceptable by |.| and I. 
The only difference to simply acceptable clauses is the condition $B_i\sigma \in \mathcal{P}$. 
Simply-local models capture all input-consuming derivations of a simply moded 
query, including the ones where we impose an additional condition $\mathcal{P}$. Hence this 
small modification gives us a sufficient criterion for input $\mathcal{P}$-termination. 

Theorem 5.4 ([68]). Let P and Q be a simply moded program and query. If 
P is simply $\mathcal{P}$-acceptable by some |.| and I, then P and Q input $\mathcal{P}$-terminate. 



5.3 Examples 

We give two examples where $\mathcal{P}$ is used exactly to model delay-safe selection 
rules. These programs need delay-safe selection rules to overcome the problem 
of speculative output bindings [55]. 

Example 5.5. Consider PERMUTE_BACK (Fig. 5) assuming mode permute(0, /), 
insert(0, O, I). Recall Ex. 4.8. We define the level mapping as 

|permute(Xs, Ts)| = length(Ys) 

|insert(Zs, X, ls)| = length(Ys). 

Now for all atoms insert(Zs, X, Is) G PMp^, we have | ls| > \Zs\; for the ones 
on the r.h.s. in (3) even | Ts| > \Zs\. Let V be the set of bounded atoms w.r.t. |.|. 

Now let us look at the recursive clause for permute. We verify that the second
body atom fulfils the requirement of Def. 5.3, where I is $PM_P^{SL}$. So we have to
consider all simply-local substitutions $\sigma$ such that insert(Zs, X, Ys)$\sigma \in PM_P^{SL}$.
For the atoms on the l.h.s. in (3), this means that

$\sigma \supseteq \{Ys/[Y_n, \ldots, Y_1|Vs],\ Zs/[Y_n, \ldots, Y_1|Us],\ X/U\}$ $(n \geq 0)$.

Clearly, permute(Xs, Zs)$\sigma \notin \mathcal{P}$, and hence no proof obligation arises. For the
atoms on the r.h.s. in (3), this means that

$\sigma \supseteq \{Ys/[Y_n, \ldots, Y_1, X_1|Xs_1],\ Zs/[Y_n, \ldots, Y_1|Xs_1],\ X/X_1\}$ $(n \geq 0)$.

But then |permute([X|Xs], Ys)$\sigma$| > |permute(Xs, Zs)$\sigma$|.

The other clauses are trivial to check, and so PERMUTE_BACK is simply
$\mathcal{P}$-acceptable.



Example 5.6. The program NAIVE_REVERSE (Fig. 3) in mode reverse(O, I),
append(O, O, I) is not input terminating, but it is input $\mathcal{P}$-terminating for $\mathcal{P}$
chosen in perfect analogy to Ex. 5.5.

In our opinion, the difference between delay-safe selection rules and (just) 
input-consuming selection rules is a fundamental one. Looking at the literature, 
the termination problem for the latter has been considered a much harder
problem than for the former [45,47,49,55]. We also refer to Subsec. 11.7.

r(X) ← p(X,Y), r(Y).        p(X, s(X)) ← fail.
r(0).                       p(s(X), X).

Fig. 8. A program for which locality is crucial



5.4 On Completeness of the Characterisation 

Our investigations so far suggest that the criterion of simply $\mathcal{P}$-acceptability is
not a necessary criterion, but that modifications are needed. More specifically,
it seems that the condition (4) in Def. 5.3 must be “weakened” to something like

$B_1\sigma, \ldots, B_{i-1}\sigma \in I$ and $A \simeq B_i$ and $B_i\sigma \in \mathcal{P}$ and $A\sigma \in \mathcal{P}$ imply $|A\sigma| > |B_i\sigma|$,

but it is not clear if this is strictly weaker. Therefore, we cannot provide a
counterexample showing that $\mathcal{P}$-acceptability is not a necessary criterion.

So while the completeness issue is still work in progress, we hope that a
modified criterion will eventually be found. It should then probably take over
the name simply $\mathcal{P}$-acceptability, replacing our current definition.

Another interesting topic for future work would consist of investigating the
automatic inference of properties $\mathcal{P}$ for which termination of a given program
can be established.

6 Local Delay Termination 

In this section, we consider selection rules that are both local and delay-safe. We
first give an example of a program that is not input $\mathcal{P}$-terminating, for a $\mathcal{P}$ that
ensures delay-safe selection rules. We shall see that the program terminates for
all selection rules that are local in addition to being delay-safe (see Ex. 6.9).

Example 6.1. Let P be the program in Fig. 8 in mode r(I), p(I, O). Setting
$\mathcal{P} = \{p(x, Y) \mid x \text{ ground}, Y \text{ arbitrary}\} \cup \{r(x) \mid x \text{ ground}\}$, we have the following
infinite input-consuming $\mathcal{P}$-derivation:

r(0) → p(0,Y1), r(Y1) → fail, r(s(0)) →
fail, p(s(0),Y1), r(Y1) → fail, fail, r(s(s(0))) → ...

We give an intuitive explanation why P cannot be simply $\mathcal{P}$-acceptable. Since
fail is a simply moded atom, it turns out that for any X, we have p(X, s(X)) $\in PM_P^{SL}$.
So for the recursive clause to be simply $\mathcal{P}$-acceptable, we would need
|X| > |s(X)| for all X, which is impossible since there are no infinite descending
chains in $\mathbb{N}$.

This example also demonstrates that the class of local delay terminating 
programs strictly includes the class of strongly terminating programs. 

The example is artificial. We will come back to this point in the conclusion. 
In any case, the assumption of local selection rules is crucial for the method for
showing termination of this section.

6.1 Operational Definition

Marchiori and Teusink [47] have considered local selection rules controlled by 
delay declarations. They define a safe delay declaration so that an atom can be 
selected only when it is bounded w.r.t. a level mapping. In order to avoid even 
having to define delay declarations, we took a shortcut by assuming delay-safe 
selection rules. This seems legitimate given that Marchiori and Teusink do not 
give the exact syntax of delay declarations either. 

Definition 6.2. A program P and query Q local delay terminate (w.r.t. |.|) if
they universally terminate w.r.t. the set of selection rules that are both local
and delay-safe (w.r.t. |.|).

Unlike in the previous two sections, modes are not used explicitly in the
definition of delay-safe selection rules. Therefore it is possible to contrive an
example of a program and a query that input terminate (and hence a fortiori
input $\mathcal{P}$-terminate) but do not local delay terminate. The example is obtained
by deliberately choosing a level mapping that does not reflect the mode of the
query at hand.

Example 6.3. The APPEND program and the query

append([], [], X), append(X, [], Y)

input terminate for the mode append(I, I, O). However, they do not local delay
terminate w.r.t. a level mapping |.| such that |A| = 0 for every A (e.g. consider
the RD selection rule).

However, in Subsec. 10.2 we will see that under natural assumptions (in
particular, the level mapping must be moded) delay-safe selection rules are also
input-consuming. Then, input termination implies local delay termination. As is
witnessed by Ex. 6.1, a program which local delay terminates but does not even
input $\mathcal{P}$-terminate, this implication is strict.

6.2 Information on Data Flow: Covers 

Delay-safe selection rules ensure that selected atoms are bounded. To ensure
that the level mapping decreases during a derivation, we exploit additional
information provided by a model of the program. Given an atom B in a query, we
are interested in other atoms that share variables with B, so that instantiating
these variables makes B bounded. A set of such atoms is called a direct cover.
The only way of making B bounded is by resolving away one of its direct covers.
The formal definition is as follows.

Definition 6.4. Let |.| be a level mapping, $A \leftarrow Q$ a clause containing a body
atom B, and C a subset^ of Q such that $B \notin C$. We say that C is a direct
cover for B (w.r.t. $A \leftarrow Q$ and |.|) if there exists a substitution $\theta$ such that $B\theta$
is bounded w.r.t. |.| and $Dom(\theta) \subseteq Vars(A, C)$.

A direct cover is minimal if no proper subset is a direct cover. 

^ By abuse of terminology, here we identify a query with the set of atoms it contains.

Note that the above concept is similar to well-modedness, assuming a moded
level mapping. In this case, for each atom, the atoms to the left of it are a direct
cover. This generalises in the obvious way to permutation well moded queries.
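
The notion of a minimal direct cover can be made concrete with a small sketch. It is written in Python rather than Prolog and rests on a simplifying assumption that is not part of Def. 6.4: an atom is taken to be bounded exactly when a given set of "critical" variables is instantiated, so that C is a direct cover for B iff those variables all occur in Vars(A, C). The function name and the string encoding of atoms are hypothetical; the clause used is the first clause of Ex. 6.11, with |q(t)| = size(t) and level 0 for the other atoms.

```python
from itertools import combinations

def minimal_direct_covers(head_vars, body_vars, critical, target):
    """Minimal direct covers of `target` under the groundness assumption.

    head_vars: variables of the clause head A
    body_vars: dict mapping each body atom (a string) to its set of variables
    critical:  dict mapping each body atom to the variables that must be
               instantiated for the atom to be bounded (an assumption)
    """
    others = [atom for atom in body_vars if atom != target]
    covers = []
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            # variables that a substitution with domain Vars(A, C) may bind
            available = set(head_vars).union(*(body_vars[a] for a in subset))
            if critical[target] <= available:
                covers.append(frozenset(subset))
    # keep only covers that have no proper subset which is also a cover
    return [c for c in covers if not any(d < c for d in covers)]

# Clause z <- p(X), q(X), r(X) from Ex. 6.11; the head z has no variables.
body_vars = {"p(X)": {"X"}, "q(X)": {"X"}, "r(X)": {"X"}}
critical = {"p(X)": set(), "q(X)": {"X"}, "r(X)": set()}

print(minimal_direct_covers(set(), body_vars, critical, "q(X)"))
print(minimal_direct_covers(set(), body_vars, critical, "p(X)"))
```

Under these assumptions, q(X) has the two minimal direct covers {p(X)} and {r(X)}, while p(X) is always bounded and hence has the empty set as its only minimal direct cover.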

Considering an atom B, we have said that the only way of making B bounded
is by resolving away one of B's direct covers. However, for an atom in a direct
cover, say atom A, to be selected, A must be bounded, and the only way of
making A bounded is by resolving away one of A’s direct covers. Iterating this
reasoning gives rise to a kind of closure of the notion of direct cover. In the
following definition, Pow stands for the powerset.

Definition 6.5. Let |.| be a level mapping and $A \leftarrow Q$ a clause. Consider the
least set $\mathcal{C}$, subset of $Q \times Pow(Q)$, such that

1. $(B, \emptyset) \in \mathcal{C}$ whenever $\emptyset$ is a minimal direct cover for B in $A \leftarrow Q$;

2. $(B, C) \in \mathcal{C}$ whenever $B \notin C$, and $C = \{C_1, \ldots, C_k\} \cup D_1 \cup \ldots \cup D_k$, where
$\{C_1, \ldots, C_k\}$ is a minimal direct cover of B in $A \leftarrow Q$, and for $i \in [1, k]$,
$(C_i, D_i) \in \mathcal{C}$.

The set $Covers(A \leftarrow Q) \subseteq Q \times Pow(Q)$ is defined as the set obtained by deleting
from $\mathcal{C}$ each element of the form (B, C) if there exists another element of $\mathcal{C}$ of
the form (B, C') such that $C' \subset C$.

We say that C is a cover for B (w.r.t. $A \leftarrow Q$ and |.|) if (B, C) is an element
of $Covers(A \leftarrow Q)$.



6.3 Declarative Characterisation 

The following concept is used to show that programs terminate for local and 
delay-safe selection rules. We present a definition slightly different from the orig- 
inal one [47], albeit equivalent. 

Definition 6.6. Let |.| be a level mapping and I a Herbrand interpretation. A
program P is delay-recurrent by |.| and I if I is a model of P, and for every
clause $c = A \leftarrow B_1, \ldots, B_n$ of P, for every $i \in [1, n]$, for every cover C for $B_i$,
for every substitution $\theta$ such that $c\theta$ is ground,

if $I \models C\theta$ then $|A\theta| > |B_i\theta|$.

We believe that this notion would have been better called delay-acceptable,
since the convention is to call decreasing notions that involve models (...)-acceptable,
and the ones that do not involve models (...)-recurrent.

Just as simply-acceptability, delay-recurrence imposes no proof obligation on 
queries. Such a proof obligation is made redundant by the fact that selected 
atoms must be bounded. Note that if no most recently introduced atom in a 
query is bounded, we obtain termination by deadlock. 

In order for delay-recurrence to ensure termination, it is crucial that when
an atom is selected, its cover is resolved away completely (this allows us to use the
premise $I \models C\theta$ in Def. 6.6). This is the reason why the selection rule is assumed
to be local. We can now state the result of this section.

Theorem 6.7 ([47]). Let P be a program. If P is delay-recurrent by some |.|
and I, then for every query Q, P and Q local delay terminate.

Remark 6.8. Remark 4.11 applies to local delay termination as well. 

6.4 Examples 

Example 6.9. Consider again the program in Fig. 8, with the level mapping and
model

|p(x, y)| = size(x)

|r(x)| = size(x) + 1

$I = \{p(s(z), z) \mid z \text{ arbitrary}\} \cup \{r(s^n(0)) \mid n \geq 0\}$.

The program is delay-recurrent by |.| and I. We check the recursive clause for
r. Consider an arbitrary ground instance

r(x) ← p(x, y), r(y). (5)

First, we observe that I is a model of this instance. In fact, if its body is true in
I, then $x = s^{n+1}(0)$ and $y = s^n(0)$ for some $n \geq 0$, and so r(x) is true in I.

Consider the first body atom. It has an empty cover. Since size(x) + 1 >
size(x), we have a decrease as required.

Consider now the second body atom. There is only one cover p(X, Y), so we
must show that

$x = s^{n+1}(0)$ and $y = s^n(0)$ imply size(x) + 1 > size(y) + 1,

which is evident. Hence we have shown that the clause is delay-recurrent.

Note that for the $\mathcal{P}$ given in Ex. 6.1, any input-consuming $\mathcal{P}$-derivation
is delay-safe. So it is the locality property that makes the difference to that
example.
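
The check carried out in Ex. 6.9 can be replayed mechanically on a finite sample of ground instances. The following Python sketch is only an illustration under stated assumptions: the signature is assumed to contain just 0 and s, so that a ground term $s^n(0)$ can be encoded as the integer n (and size as the identity), and the function names are hypothetical. A finite enumeration is of course no proof; it merely tests the three conditions verified above for clause (5).

```python
def level_p(x, y):
    # |p(x, y)| = size(x); the numeral s^n(0) is encoded as the int n
    return x

def level_r(x):
    # |r(x)| = size(x) + 1
    return x + 1

def in_model(pred, *args):
    """Membership in I = {p(s(z), z)} U {r(s^n(0)) | n >= 0}."""
    if pred == "p":
        x, y = args
        return x == y + 1
    if pred == "r":
        return True
    return False  # fail is never true

def check_recursive_clause(bound):
    """Check r(x) <- p(x, y), r(y) on all instances with 0 <= x, y < bound."""
    for x in range(bound):
        for y in range(bound):
            # I must be a model of the instance
            if in_model("p", x, y) and in_model("r", y) and not in_model("r", x):
                return False
            # first body atom: empty cover, so the decrease is unconditional
            if not level_r(x) > level_p(x, y):
                return False
            # second body atom: cover {p(x, y)}, decrease needed only if it is in I
            if in_model("p", x, y) and not level_r(x) > level_r(y):
                return False
    return True

print(check_recursive_clause(50))
```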

We now give another example, which seems even more contrived than Ex. 6.1, 
but turns out to be interesting because of the similarity to Ex. 7.1. 

Example 6.10. Consider

p(X) ← q(Y), p(Y).
q(0) ← fail.

with |p(0)| = 0.

For the sake of comparison, assume the mode p(I), q(O) and let
$\mathcal{P} = \{p(0)\} \cup \{q(X) \mid X \text{ arbitrary}\}$.

Then the program local delay terminates but does not input $\mathcal{P}$-terminate for
the query p(0). We will discuss this example further in the conclusion.

In an article that was the predecessor of this one [60], we gave PERMUTE_BACK
(Fig. 5) as an example of a delay-recurrent program, but since then, it has been
shown that this program does not require locality for termination (Ex. 5.5).

6.5 On Completeness of the Characterisation

Note that delay-recurrence is a sufficient but not necessary condition for local 
delay termination. The limitation lies in the notion of cover: to make an atom 
bounded, one has to resolve one of its covers; but conversely, resolving a cover 
will not necessarily make the atom bounded. 

Example 6.11. Consider the following simple program

z ← p(X), q(X), r(X).
p(0).
q(s(X)) ← q(X).
r(X).

The program and any query z local delay terminate w.r.t. the level mapping:

|z| = |p(t)| = |r(t)| = 0

|q(t)| = size(t).

In fact, the only source of non-termination for a query might be an atom q(X).
However, for any such atom selected by a delay-safe selection rule, X is a ground
term. Hence the recursive clause in the program cannot generate an infinite
derivation. On the other hand, it is not the case that the program is delay-
recurrent. Consider the first clause. Since r(X) is a cover for q(X) and since every
model of the program contains r(t) for every t, we would have to show for some
|.|' that for every t,

|z|' > |q(t)|'.

This is impossible, since delay-recurrence on the third clause implies $|q(s^k(0))|' \geq k$
for any natural k.

7 Left-Termination 

In this section, we consider the LD selection rule. We first give an example of a 
program that is not local delay terminating. 

Example 7.1. Consider the program

p ← q, p.

with query p, where |p| = 1 and |q| = 0. It terminates for the LD selection rule
but does not local delay terminate.

The example is artificial, and hence not a convincing motivation for studying 
the LD selection rule. We discuss this further in the conclusion, but in any case, 
there are several reasons for studying the LD selection rule in its own right. 
First, the conditions for termination are easier to formulate than for local delay 
termination. Secondly, the vast majority of works consider this rule, since it is the
standard selection rule of Prolog. Finally, for the class of programs and queries
that terminate w.r.t. the LD selection rule we are able to provide a sound and
complete characterisation.

7.1 Operational Definition

Definition 7.2. A program P and query Q left-terminate if they universally 
terminate w.r.t. the set consisting of only the LD selection rule. 

Formally comparing this class to the three previous ones is difficult. In par- 
ticular, left-termination is not necessarily stronger than input or local delay 
termination, e.g. when applied to programs written with the RD selection rule 
in mind. 

Example 7.3. Consider the program PERMUTE_BACK in Fig. 5, but this time in
mode permute(I, O), insert(I, I, O). This program input terminates but does
not left-terminate (see Ex. 4.13 and note Remark 4.11).



Example 7.4. Consider the program in Fig. 8, where we permute two body atoms
in the first clause to obtain

r(X) ← r(Y), p(X,Y).

By Remark 6.8 and Ex. 6.9, the program and every query local delay ter- 
minate w.r.t. the level mapping given there. Moreover, no derivation deadlocks. 
However, the program and the query r(0) do not left-terminate. 

Also, local delay termination may not imply left-termination because of the 
deadlock problem. We will comment on this in the conclusion. 

7.2 Extended Level Mappings 

Left-termination was addressed by Apt & Pedreschi [7], who introduced the 
class of acceptable logic programs. However, their characterisation encountered 
a completeness problem similar to the one highlighted for Theorem 3.3. 

Example 7.5. Figure 9 shows TRANSP, a program that terminates on a strict sub-
set of ground queries only. In the intended meaning of the program, trans(x, y, e)
succeeds iff $x \leadsto_e y$, i.e. if arc(x, y) is in the transitive closure of a directed acyclic
graph (DAG) e, which is represented as a list of arcs. It is readily checked that
if e is a graph that contains a cycle, infinite derivations may occur.

In the approach of [7], TRANSP cannot be reasoned about, since the same in- 
completeness problem as for recurrent programs occurs, namely that they char- 
acterise a class of programs that (left-)terminate for every ground query. 

The cause of the restricted form of completeness of Theorem 3.3 lies in the use
of level mappings, which must specify a natural number for every ground atom —
hence termination is forced for every ground query. A more subtle problem with
using level mappings is that one must specify values also for uninteresting atoms,
such as trans(x, y, e) when e is not a DAG. The solution to both problems is to
consider extended level mappings [61,62].

% trans(x,y,e) ← $x \leadsto_e y$ for a DAG e
trans(X,Y,E) ←
    member(arc(X,Y),E).
trans(X,Y,E) ←
    member(arc(X,Z),E), trans(Z,Y,E).

Fig. 9. TRANSP

Definition 7.6. An extended level mapping is a function |.| from
ground atoms to $\mathbb{N}^\infty$, where $\mathbb{N}^\infty = \mathbb{N} \cup \{\infty\}$.

In particular, we define $n \triangleright m$ for $n, m \in \mathbb{N}^\infty$ iff $n = \infty$ or $n > m$. We write
$n \trianglerighteq m$ iff $n \triangleright m$ or $n = m$.

We have $\infty \triangleright m$ for every $m \in \mathbb{N}^\infty$. In particular $\infty \triangleright \infty$. For (only) this
reason, $\triangleright$ is not well-founded. However, this makes sense since the inclusion
of $\infty$ in the codomain is intended to model non-termination and uninteresting
instances of program clauses.
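
As a small illustration of the two relations just defined, the following Python sketch encodes $\infty$ as the floating-point infinity `math.inf`; the function names are hypothetical and chosen only for this example.

```python
import math

INF = math.inf  # stands for the extra element "infinity" of the codomain

def strictly_above(n, m):
    """n |> m  iff  n is infinite or n > m."""
    return n == INF or n > m

def above_or_equal(n, m):
    """n |>= m  iff  n |> m or n = m."""
    return strictly_above(n, m) or n == m

print(strictly_above(3, 2))      # True
print(strictly_above(2, 2))      # False
print(strictly_above(INF, INF))  # True: the reason the relation is not well-founded
```

The last call shows the only way in which an infinite descending chain can arise: every element of such a chain must have level $\infty$.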

7.3 Declarative Characterisation 

With the above notation we are now ready to introduce (a modified definition 
of) acceptable programs and queries. A program P is acceptable if for every 
ground instance of a clause from P, the level of the head is greater than the 
level of each atom in the body such that the body atoms to its left are true in 
a Herbrand model of the program. 

The modification w.r.t. [7] lies in the fact that the definition of an acceptable
clause may involve clause instances where both the head and a body atom have
level $\infty$. Intuitively, a non-terminating derivation would start in a query of level
$\infty$ and always use clause instances where head and recursive body atoms have
level $\infty$, while an acceptable (and terminating) query must have a level in $\mathbb{N}$.

Definition 7.7. Let |.| be an extended level mapping, and I a Herbrand inter-
pretation. A program P is acceptable by |.| and I if I is a model of P, and for
every $A \leftarrow B_1, \ldots, B_n \in ground_L(P)$,

for all $i \in [1, n]$, $I \models B_1, \ldots, B_{i-1}$ implies $|A| \triangleright |B_i|$.

A query Q is acceptable by |.| and I if there exists $k \in \mathbb{N}$ such that for every
$A_1, \ldots, A_n \in ground_L(Q)$,

for all $i \in [1, n]$, $I \models A_1, \ldots, A_{i-1}$ implies $k \triangleright |A_i|$.

Let us compare this definition to the definition of delay-recurrence (Def. 6.6).
In the case of local and delay-safe selection rules, an atom cannot be selected
before one of its covers is completely resolved. In the case of the LD selection
rule, an atom cannot be selected before the atoms to its left are completely
resolved. Because of the correctness of LD resolution [2], this explains why, in
both cases, a decrease is only required if the instance of the cover, resp. the
instance of the atoms to the left, are in some model of the program. We also
refer to Subsec. 10.1.

The member predicate used by TRANSP (Fig. 9) is defined by:

member(X, [X|Xs]).
member(X, [Y|Xs]) ←
    member(X,Xs).

Acceptable programs and queries precisely characterise left-termination. 

Theorem 7.8 ([7,61]). Let P be a program and Q a query. 

If P and Q are both acceptable by some |.| and I, then P and Q left-terminate. 
Conversely, if P and Q left-terminate, then there exist an extended level
mapping |.| and a Herbrand interpretation I such that P and Q are both acceptable
by |.| and I.

7.4 Examples 

Example 7.9. The program in Ex. 7.1 is trivially acceptable. Let |p| = 1 and
|q| = 0 and $I = \emptyset$. Then $|p| \triangleright |q|$, I is a model of the program, and, since
$I \not\models q$, the implication “$I \models q$ implies $|p| \triangleright |p|$” holds vacuously.

We now give an example that highlights the use of extended level mappings 
in termination proofs. Note that we do not intend this example to be contrasted 
with the three preceding termination classes. 

Example 7.10. We will show that TRANSP is acceptable. We have pointed out
that in the intended use of the program, e is supposed to be a DAG. We define:

|trans(x, y, e)| = length(e) + 1 + Card{v | $x \leadsto_e v$} if e is a DAG, and $\infty$ otherwise

|member(x, e)| = length(e)

I = {trans(x, y, e) | x, y, e $\in U_L$} $\cup$
    {member(x, e) | x is in the list e},

where Card is the set cardinality operator. It is easy to check that TRANSP is
acceptable by |.| and I. In particular, consider a ground instance of the second
clause:

trans(x, y, e) ← member(arc(x, z), e), trans(z, y, e).

It is immediate to see that I is a model of it. In addition, we have the proof
obligations:

(i) |trans(x, y, e)| $\triangleright$ |member(arc(x, z), e)|

(ii) arc(x, z) is in e implies |trans(x, y, e)| $\triangleright$ |trans(z, y, e)|.

The first one is easy to show since |trans(x, y, e)| $\triangleright$ length(e). Considering the
second one, we distinguish two cases. If e is not a DAG, the conclusion is
immediate. Otherwise, arc(x, z) in e implies that Card{v | $x \leadsto_e v$} > Card{v | $z \leadsto_e v$},
and so:

|trans(x, y, e)| = length(e) + 1 + Card{v | $x \leadsto_e v$}
    $\triangleright$ length(e) + 1 + Card{v | $z \leadsto_e v$} = |trans(z, y, e)|.

(s)  system(N) ←
         prod(Bs), cons(Bs,N).

(p1) prod([s(0)|Bs]) ←
         prod(Bs).
(p2) prod([s(s(0))|Bs]) ←
         prod(Bs).
     prod([]).

(c)  cons([D|Bs], s(N)) ←
         cons(Bs,N), wait(D).
     cons([], 0).

(w)  wait(s(D)) ←
         wait(D).
     wait(0).

Fig. 10. PRODCONS



Finally, observe that for a DAG e, the queries trans(x, Y, e) and trans(X, Y, e)
are acceptable by |.| and I. The first one is intended to compute all nodes y such
that $x \leadsto_e y$, while the second one computes the binary relation $\leadsto_e$. Therefore,
the TRANSP program and those queries left-terminate.
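
The extended level mapping of Ex. 7.10 can be computed directly for small graphs. The Python sketch below is an illustration only: graphs are encoded as lists of pairs instead of lists of arc/2 terms, $x \leadsto_e v$ is read as "there is a non-empty path from x to v in e", and all function names are hypothetical. It checks the two proof obligations for the second clause on every node x and every arc (x, z) of a given graph.

```python
import math

def successors(x, arcs):
    return [b for (a, b) in arcs if a == x]

def reachable(x, arcs):
    """Nodes v such that there is a non-empty path from x to v."""
    seen, stack = set(), successors(x, arcs)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(successors(v, arcs))
    return seen

def is_dag(arcs):
    nodes = {n for arc in arcs for n in arc}
    return all(n not in reachable(n, arcs) for n in nodes)

def level_trans(x, arcs):
    # the level does not depend on the second argument y of trans(x, y, e)
    if not is_dag(arcs):
        return math.inf
    return len(arcs) + 1 + len(reachable(x, arcs))

def level_member(arcs):
    return len(arcs)

def gt(n, m):
    """The relation n |> m on extended levels."""
    return n == math.inf or n > m

def check_second_clause(arcs):
    nodes = {n for arc in arcs for n in arc}
    for x in nodes:
        # obligation (i)
        if not gt(level_trans(x, arcs), level_member(arcs)):
            return False
        # obligation (ii), only for z with arc(x, z) in e
        for z in successors(x, arcs):
            if not gt(level_trans(x, arcs), level_trans(z, arcs)):
                return False
    return True

dag = [("a", "b"), ("b", "c"), ("a", "c")]
cyclic = [("a", "b"), ("b", "a")]
print(level_trans("a", dag), check_second_clause(dag))
print(level_trans("a", cyclic), check_second_clause(cyclic))
```

For the cyclic graph every trans atom has level $\infty$, so the obligations hold trivially, in line with the remark that $\infty$ models uninteresting instances.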

Note that this is of course also an example of a program and a query which 
left-terminate but do not strongly terminate (e.g. consider the RD selection rule). 

8 $\exists$-Termination

So far we have considered five classes of terminating programs, making increas- 
ingly strong assumptions about the selection rule, or in other words, considering 
in each section a smaller set of selection rules. In the previous section we have 
arrived at a singleton set containing the LD selection rule. Therefore we can 
clearly not strengthen our assumptions, in the same sense as before, any further. 

We will now consider an assumption about the selection rule which is the
dual to assuming all selection rules (Sec. 3). We introduce $\exists$-termination of
logic programs [63], claiming that it is an essential concept for separating the
logic and control aspects of a program.

Before that, however, we illustrate the limitations of left-termination.

Example 8.1. The program PRODCONS in Fig. 10 abstracts a (concurrent) system
composed of a producer and a consumer. For notational convenience, we identify
the term $s^n(0)$ with the natural number n. Intuitively, prod is the producer
of a non-deterministic sequence of 1’s and 2’s, and cons the consumer of the
sequence. The shared variable Bs in clause (s) acts as an unbounded buffer. The
overall system is started by the query system(n). Note that the program is well
moded with the obvious mode prod(O), cons(I, I), wait(I), but assuming LD
(and hence, input-consuming) derivations does not ensure termination. The crux
is that prod can produce a message sequence of arbitrary length. Now cons can
only consume a message sequence of length n, but for this to ensure termination,
atoms using cons must be eventually selected. We will see that a selection rule
exists for which this program and the query system(n) terminate.

8.1 Operational Definition

Definition 8.2. A program P and a query Q $\exists$-terminate if there exists a non-
empty set S of standard selection rules such that P and Q universally terminate
w.r.t. S.

If P and Q do not $\exists$-terminate, then no standard selection rule can be
terminating. For extensions of the standard definition of selection rule, such as
input-consuming and delay-safe rules, this is not always true.

Example 8.3. The simple program

p(s(X)) ← p(X).
p(X).

with mode p(I) and query p(X) input terminates by deadlock, but does not
$\exists$-terminate. The same program and query local delay terminate (w.r.t. |p(t)| =
size(t)).

We will come back to the issue of deadlock in Subsec. 10.2. 

We observe that $\exists$-termination coincides with universal termination w.r.t. the
set of fair selection rules. Therefore, any fair selection rule is a terminating control
for any program and query for which a terminating control exists.

Theorem 8.4 ([62,63]). A program P and a query Q $\exists$-terminate iff they
universally terminate w.r.t. the set of fair selection rules.

Concerning Ex. 8.1, it can be said that viewed as a concurrent system, the 
program inherently relies on fairness for termination. 



8.2 Declarative Characterisation 

Ruggieri [62,63] offers a characterisation of $\exists$-termination using the notion of
fair-bounded programs and queries. Just as Def. 7.7, it is based on extended
level mappings.

Definition 8.5. Let |.| be an extended level mapping, and I a Herbrand inter-
pretation. A program P is fair-bounded by |.| and I if I is a model of P such
that for every $A \leftarrow B_1, \ldots, B_n \in ground_L(P)$:

(a) $I \models B_1, \ldots, B_n$ implies that for every $i \in [1, n]$, $|A| \triangleright |B_i|$, and

(b) $I \not\models B_1, \ldots, B_n$ implies that for some $i \in [1, n]$ with $I \not\models B_i$, $|A| \triangleright |B_i|$.

A query Q is fair-bounded by |.| and I if there exists $k \in \mathbb{N}$ such that for
every $A_1, \ldots, A_n \in ground_L(Q)$:

(a) $I \models A_1, \ldots, A_n$ implies that for every $i \in [1, n]$, $k \triangleright |A_i|$, and

(b) $I \not\models A_1, \ldots, A_n$ implies that for some $i \in [1, n]$ with $I \not\models A_i$, $k \triangleright |A_i|$.

Note that the hypotheses of conditions (a) and (b) are mutually exclusive.

Let us discuss in more detail the meaning of proof obligations (a) and (b) in
Def. 8.5. Consider a ground instance $A \leftarrow B_1, \ldots, B_n$ of a clause.

If the body $B_1, \ldots, B_n$ is true in the model I, then there might exist an
SLD-refutation for it. Condition (a) is then intended to bound the length of the
refutation.

If the body is not true in the model I, then it cannot have a refutation. In 
this case, termination actually means that there is an atom in the body that has 
a finitely failed SLD-tree. Condition (b) is then intended to bound the depth of
the finitely failed SLD-tree. As a consequence of this, the complement of I is 
necessarily included in the finite failure set of the program. 

Compared to acceptability, the model and the extended level mapping in the 
proof of fair-boundedness have to be chosen more carefully, due to more binding 
proof obligations. As we will see in Subsec. 10.2, however, the simpler proof 
obligations of recurrence and acceptability are sufficient conditions for proving 
fair-boundedness. Note also that, as in the case of acceptable programs, the
inclusion of $\infty$ in the codomain of extended level mappings allows for excluding
unintended atoms and non-terminating atoms from the termination analysis. In
fact, if |A| = $\infty$ then (a, b) in Def. 8.5 are trivially satisfied.

Fair-bounded programs and queries precisely characterise $\exists$-termination,
i.e. the class of logic programs and queries for which a terminating control exists.

Theorem 8.6 ([62,63]). Let P be a program and Q a query.

If P and Q are both fair-bounded by some |.| and I, then P and Q
$\exists$-terminate.

Conversely, if P and Q $\exists$-terminate, then there exist an extended level
mapping |.| and a Herbrand interpretation I such that P and Q are both fair-bounded
by |.| and I.



8.3 Example 

Example 8.7. The PRODCONS program is fair-bounded. First, we introduce the
list-max norm:

lmax(f(x_1, ..., x_n)) = 0 if f $\neq$ [.|.]

lmax([x|xs]) = max{lmax(xs), size(x)} otherwise.

Note that for a ground list xs, lmax(xs) equals the maximum size of an element
in xs. Then we define:

|system(n)| = size(n) + 3

|prod(bs)| = length(bs)

|cons(bs, n)| = size(n) + lmax(bs) if $I \models$ cons(bs, n), and size(n) if $I \not\models$ cons(bs, n)

|wait(t)| = size(t)

I = {system(n) | n $\in U_L$} $\cup$ {prod(bs) | lmax(bs) $\leq$ 2} $\cup$
    {cons(bs, n) | length(bs) = size(n)} $\cup$ {wait(x) | x $\in U_L$}.

Let us show the proof obligations of Def. 8.5. Those for unit clauses are trivial.
Consider now the recursive clauses (w), (c), (p1), (p2), and (s).

(w). I is obviously a model of (w). In addition, |wait(s(d))| = size(d) + 1 $\triangleright$
size(d) = |wait(d)|. This implies (a, b).

(c). Consider a ground instance cons([d|bs], s(n)) ← cons(bs, n), wait(d) of (c).
If $I \models$ cons(bs, n), wait(d), then length(bs) = size(n), and so

length([d|bs]) = length(bs) + 1 = size(n) + 1 = size(s(n)),

i.e. $I \models$ cons([d|bs], s(n)). Therefore, I is a model of (c). Let us show proof
obligations (a, b) of Def. 8.5.

(a) Suppose that $I \models$ cons(bs, n), wait(d). We have already shown that $I \models$
cons([d|bs], s(n)). We calculate:

|cons([d|bs], s(n))| = size(n) + 1 + max{lmax(bs), size(d)}
    $\triangleright$ size(n) + lmax(bs) = |cons(bs, n)|

|cons([d|bs], s(n))| = size(n) + 1 + max{lmax(bs), size(d)}
    $\triangleright$ size(d) = |wait(d)|.

These two inequalities show that (a) holds.

(b) If $I \not\models$ cons(bs, n), wait(d), then necessarily $I \not\models$ cons(bs, n). Therefore

|cons([d|bs], s(n))| $\geq$ size(n) + 1
    $\triangleright$ size(n) = |cons(bs, n)|,

and so we have (b). Recall that (b) states that the depth of the finitely failed
SLD-tree must be bounded. In fact, it is the decrease of the “counter”, the
second argument of cons, which in this case bounds the depth of the
SLD-tree.

(p1, p2). I is obviously a model of (p1). Moreover we have

|prod([s(0)|bs])| = length(bs) + 1 $\triangleright$ length(bs) = |prod(bs)|,

which implies (a) and (b). The reasoning for (p2) is analogous.

(s). Consider a ground instance system(n) ← prod(bs), cons(bs, n) of (s).
Obviously I is a model of (s). Let us show (a, b).

(a) Suppose that $I \models$ prod(bs), cons(bs, n). This implies lmax(bs) $\leq$ 2 and
length(bs) = size(n). These imply:

|system(n)| = size(n) + 3 $\triangleright$ length(bs) = |prod(bs)|

|system(n)| = size(n) + 3 $\triangleright$ size(n) + lmax(bs) = |cons(bs, n)|.

% even(X) ← X is an even natural number.
even(s(X)) ← odd(X).
even(0).

% odd(X) ← X is an odd natural number.
odd(s(X)) ← even(X).

Fig. 11. ODDEVEN



(b) Suppose that $I \not\models$ prod(bs), cons(bs, n). Intuitively, this means that the
query prod(bs), cons(bs, n) has no refutation. We distinguish two cases. If
$I \not\models$ cons(bs, n) (cons(bs, n) has no refutation) then:

|system(n)| = size(n) + 3 $\triangleright$ size(n) = |cons(bs, n)|.

If $I \models$ cons(bs, n) and $I \not\models$ prod(bs) (prod(bs) has no refutation) then
length(bs) = size(n), which implies:

|system(n)| = size(n) + 3 $\triangleright$ length(bs) = |prod(bs)|.

To conclude the example, note that for every n $\in \mathbb{N}$ the query system(n)
is fair-bounded by |.| and I, and so every fair SLD-derivation of PRODCONS and
system(n) is finite.
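
The case analysis of Ex. 8.7 for clauses (s) and (c) can also be tested on a finite sample of ground instances. The Python sketch below makes several simplifying assumptions that are not in the text: natural numbers $s^n(0)$ are encoded as integers, ground lists as tuples of integers, and only small numerals and short lists are enumerated, whereas the definition quantifies over all of $U_L$. The atom encoding and function names are hypothetical. The function obligations_hold combines the model condition with obligations (a) and (b) of Def. 8.5.

```python
import math
from itertools import product

def lmax(bs):
    # list-max norm on tuples of ints; lmax of the empty list is 0
    return max(bs, default=0)

def in_I(atom):
    name, args = atom
    if name == "prod":
        return lmax(args[0]) <= 2
    if name == "cons":
        return len(args[0]) == args[1]
    return True  # system(n) and wait(x) are in I for all arguments

def level(atom):
    name, args = atom
    if name == "system":
        return args[0] + 3
    if name == "prod":
        return len(args[0])
    if name == "cons":
        bs, n = args
        return n + lmax(bs) if in_I(atom) else n
    return args[0]  # wait(t)

def gt(n, m):
    return n == math.inf or n > m

def obligations_hold(head, body):
    if all(in_I(b) for b in body):
        # I must satisfy the head, and (a) requires a decrease for every body atom
        return in_I(head) and all(gt(level(head), level(b)) for b in body)
    # (b): some body atom that is false in I must be strictly below the head
    return any(not in_I(b) and gt(level(head), level(b)) for b in body)

def small_lists(max_len, max_elem):
    for k in range(max_len + 1):
        yield from product(range(max_elem + 1), repeat=k)

def check_prodcons(max_n=3, max_len=3, max_elem=3):
    for n in range(max_n + 1):
        for bs in small_lists(max_len, max_elem):
            # clause (s): system(n) <- prod(bs), cons(bs, n)
            if not obligations_hold(("system", (n,)),
                                    [("prod", (bs,)), ("cons", (bs, n))]):
                return False
            for d in range(max_elem + 1):
                # clause (c): cons([d|bs], s(n)) <- cons(bs, n), wait(d)
                if not obligations_hold(("cons", ((d,) + bs, n + 1)),
                                        [("cons", (bs, n)), ("wait", (d,))]):
                    return False
    return True

print(check_prodcons())
```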

9 Bounded Nondeterminism 

In the previous section, we have made the strongest possible assumption about 
the selection rule, in that we considered programs and queries for which there 
exists a terminating control. In general, a terminating control may not exist. 
Even in this case however, all is not lost. If we can establish that a program and 
query have only finitely many successful derivations, then we can transform the 
program so that it terminates. 

Example 9.1. The program ODDEVEN in Fig. 11 defines the even and odd predi-
cates, with the usual intuitive meaning. The query even(X), odd(X) is intended
to check whether there is a number that is both even and odd. It is readily
checked that ODDEVEN and the query do not $\exists$-terminate. However, ODDEVEN and
the query have only finitely many, namely zero, successful derivations.



9.1 Operational Definition 

Pedreschi & Ruggieri [58] propose the notion of bounded nondeterminism to 
model programs and queries with finitely many refutations. 

Definition 9.2. A program P and query Q have bounded nondeterminism if
for every standard selection rule s there are finitely many SLD-refutations of P
and Q via s.

By the Switching Lemma [2], each refutation via some standard selection rule
is isomorphic to some refutation via any other standard selection rule. Therefore,
bounded nondeterminism could have been defined by requiring finitely many 
SLD-refutations of P and Q via some standard selection rule. Also, note that, 
while bounded nondeterminism implies that there are finitely many refutations 
also for non-standard selection rules, the converse implication does not hold, in 
general (see Ex. 8.3). 

Bounded nondeterminism, although not a notion of termination in the
strict sense, is closely related to termination. In fact, if P and Q ∃-terminate,
then P and Q have bounded nondeterminism. Conversely, if P and Q have
bounded nondeterminism then there exists an upper bound for the length of
the SLD-refutations of P and Q. If the upper bound is known, then we can
syntactically transform P and Q into an equivalent program and query that
strongly terminate, i.e. any selection rule will be a terminating control for them.
Note that this transformation is of interest even for programs and queries that
∃-terminate, since few existing systems adopt fair selection rules. In addition,
even if we adopt a selection rule that ensures termination, we may apply the
transformation to prune unsuccessful branches from the SLD-tree.

9.2 Declarative Characterisation 

In the following, we present a declarative characterisation of programs and 
queries that have bounded nondeterminism, by introducing the class of bounded
programs and queries. Just as Defs. 7.7 and 8.5, it is based on extended level 
mappings. 

Definition 9.3. Let |.| be an extended level mapping, and I a Herbrand
interpretation. A program P is bounded by |.| and I if I is a model of P such that for
every $A \leftarrow B_1, \ldots, B_n \in ground_L(P)$:

$I \models B_1, \ldots, B_n$ implies that for every $i \in [1, n]$, $|A| \rhd |B_i|$.

A query Q is bounded by |.| and I if there exists $k \in \mathbb{N}$ such that for every
$A_1, \ldots, A_n \in ground_L(Q)$:

$I \models A_1, \ldots, A_n$ implies that for every $i \in [1, n]$, $k \rhd |A_i|$.

It is straightforward to check that the definition of bounded programs is a 
simplification of Def. 8.5 of fair-bounded programs, where proof obligation (b) is 
discarded. Intuitively, the definition of boundedness only requires the decreasing 
of the extended level mapping when the body atoms are true in some model of 
the program, i.e. they might have a refutation. 

Bounded programs and queries precisely characterise the notion of bounded 
nondeterminism.

Theorem 9.4 ([58,62]). Let P be a program and Q a query.

If P and Q are both bounded by some |.| and I, then P and Q have bounded
nondeterminism.

Conversely, if P and Q have bounded nondeterminism, then there exist an
extended level mapping |.| and a Herbrand interpretation I such that P and Q 
are both bounded by |.| and I. 

9.3 Examples 

Example 9.5. Consider again the ODDEVEN program. It is readily checked that it 
is bounded by defining: 

$|even(x)| = |odd(x)| = size(x)$

$I = \{ even(s^{2i}(0)),\ odd(s^{2i+1}(0)) \mid i \geq 0 \}.$

The query even(X), odd(X) is bounded by |.| and I. In fact, since no instance 
of it is true in I, Def. 9.3 imposes no requirement. Therefore, ODDEVEN and the 
query above have bounded nondeterminism. 

Generally, for a query that has no instance in a model of the program (it
is unsolvable), the k in Def. 9.3 can be chosen as 0. An automatic method to
check whether a query (at a node of an SLD-tree) is unsolvable has been
proposed in [19]. Of course, the example is something of a limiting case, since one does
not even need to run a query if it has been shown to be unsolvable. However,
we have already mentioned that the benefits of characterising bounded
nondeterminism also apply to programs and queries belonging to the previously
introduced classes. In addition, it is still possible to devise an example program and
a satisfiable query that do not ∃-terminate but have bounded nondeterminism.

Example 9.6. We define the predicate all such that the query all($n_0$, $n_1$, Xs)
collects in Xs the answers of a query q(m, A) for values m ranging from $n_0$ to
$n_1$.

all(N,N,[A]) ← q(N,A).
all(N,N1,[A|As]) ← q(N,A), all(s(N),N1,As).
q(Y,Y).   % just as an example

The program and the query all(0, s(s(0)), As) do not ∃-terminate, but they
have only one computed answer, namely As = [0, s(0), s(s(0))]. The program and
the query are bounded (and thus have bounded nondeterminism) by defining:

$|all(n, m, x)| = \max\{size(m) - size(n), 0\} + 1$
$|q(x, y)| = 0$

$I = \{ all(n, m, x) \mid size(n) \leq size(m) \} \cup \{ q(x, y) \mid x, y \text{ arbitrary} \}.$
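
The proof obligations of Def. 9.3 for this example can be worked through as follows. This is only a sketch; it assumes that size satisfies size(s(t)) = size(t) + 1 for every ground term t, and the value k = 4 is one possible choice rather than one fixed by the text.

```latex
% Recursive clause, ground instance  all(n, n_1, [a|as]) <- q(n, a), all(s(n), n_1, as),
% with the body true in I, i.e. size(n) + 1 <= size(n_1):
\begin{align*}
|all(n, n_1, [a|as])| &= size(n_1) - size(n) + 1 \\
  &\rhd size(n_1) - size(n)
   = \max\{size(n_1) - size(s(n)), 0\} + 1
   = |all(s(n), n_1, as)|, \\
|all(n, n_1, [a|as])| &\rhd 0 = |q(n, a)|.
\end{align*}
% Non-recursive clause, ground instance  all(n, n, [a]) <- q(n, a):
\[ |all(n, n, [a])| = \max\{0, 0\} + 1 = 1 \rhd 0 = |q(n, a)|. \]
% Query all(0, s(s(0)), As): every ground instance has level
\[ \max\{size(s(s(0))) - size(0), 0\} + 1 = 3,
   \quad\text{so } k = 4 \text{ satisfies } k \rhd 3. \]
```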

10 Relations between Classes 

We have defined seven classes of programs and queries, which provide declarative 
characterisations of operational notions of universal termination and bounded 
nondeterminism. In this section we summarise the relationships between these 
classes.

Table 1. Comparison of characterisations

                                    all ground   recursive    uses     query         ∞ in       negative
                                    instances    atoms only   models   obligations   codomain   information
boundedness                         yes          no           yes      yes           yes        no
fair-boundedness                    yes          no           yes      yes           yes        yes
acceptability                       yes          no           yes      yes           yes        no
delay-recurrence                    yes          no           yes      no            no         no
$\mathcal{P}$-simply-acceptability  no           yes          yes      no            no         no
recurrence                          yes          no           no       yes           no         n.a.

10.1 Comparison of Characterisations 

We now try to provide an intuitive understanding of the technical differences 
between the characterisations of termination we have proposed. These are
summarised in Table 1. Note that simply-acceptability is a special case of
$\mathcal{P}$-simply-acceptability that does not need to be distinguished in this context.

The first difference concerns the question of whether a decrease is defined for
all ground instances of a clause, or rather for instances specified in some other
way. All characterisations except $\mathcal{P}$-simply-acceptability require a decrease for
all ground instances of a clause. One cannot attribute this difference to the
termination classes themselves: the first criterion for input-termination by Smaus [67]
also required a decrease for the ground instances of a clause, just as there are
characterisations of left-termination [14,25] based on generalised level mappings
and hence non-ground instances of clauses. However, one can say that our
characterisation of input $\mathcal{P}$-termination inherently relies on measuring the level of
non-ground atoms, which may change via further instantiation. Nevertheless,
this instantiation is not arbitrary: it is controlled by the fact that derivations
are input-consuming and the programs are simply moded. This is reflected in
the condition that a decrease holds for all simply-local instantiations of a clause.

The second difference concerns the question of whether a decrease is required
for recursive body atoms only, or whether recursion plays no role.
$\mathcal{P}$-Simply-acceptability is the only characterisation that requires a decrease for recursive
body atoms only. We attribute this difference essentially to the explicit use of
modes. Broadly speaking, modes restrict the data flow of a program in a way
that allows for termination proofs that are inherently modular. Therefore one
does not explicitly require a decrease for non-recursive calls, but rather one
requires that for the predicate of the non-recursive call, termination has already
been shown (independently). To support this explanation, we refer to [32], where
left-termination for well moded programs is shown, using well-acceptability.
Well-acceptability requires a decrease only for recursive body atoms.

The third difference concerns the question of whether the method relies on 
(some kind of) models or not. It is not surprising that a method for showing
strong termination cannot rely on models: one cannot make any assumptions
about certain atoms being resolved before an atom is selected. However, the first
methods for showing termination of input-consuming derivations were also not
based on models [16,67], and it was remarked that the principle underlying the
use of models in proofs of left-termination cannot be easily transferred to input
termination. By restricting to simply moded programs and defining a special
notion of model, this was nevertheless achieved. For a clause $H \leftarrow A_1, \ldots, A_n$,
assuming that $A_i$ is the selected atom, we exploited the fact that, provided
programs and queries are simply moded, even though $A_1, \ldots, A_{i-1}$ may
not be resolved completely, $(A_1, \ldots, A_{i-1})\theta$ will be in any “partial model” of the
program.

The fourth difference concerns the question of whether proof obligations are 
imposed on queries. Delay-recurrence and $\mathcal{P}$-simply-acceptability are the
characterisations that impose no proof obligations for queries (except that in the latter
case, the query must be simply moded). The reason is that the restrictions on 
the selectability of an atom, which depends on the degree of instantiation, take 
the role of such a proof obligation. 

The fifth difference concerns the question of whether $\infty$ is in the codomain of
level mappings. This is the case for acceptability, fair-boundedness and
boundedness. In all three cases, this allows for excluding unintended atoms and
non-terminating atoms from the termination analysis, which is crucial for achieving
full completeness of the characterisation. For an atom A with $|A| = \infty$ the proof
obligations are trivially satisfied. However, we do not see any reason why some
of the other characterisations could not also be generalised by allowing $\infty$ in the
codomain of level mappings.

A final difference concerns the way information on data flow (modes, models,
covers) is used in the declarative characterisations. For recurrence this is not
applicable. Apart from that, in all except fair-boundedness, such information
is used only in a “positive” way, i.e. “if . . . is in the model then . . .”. In
fair-boundedness, it is also used in a “negative” way, namely “if . . . is not in the
model then . . .”. Intuitively, in all characterisations except fair-boundedness,
the relevant part of the information concerns a characterisation of atoms that
are logical consequences of the program. In fair-boundedness, the characterisation
of atoms that are not logical consequences is also relevant, since for those
atoms we must ensure finite failure.



10.2 From Strong Termination to Bounded Nondeterminism

In this subsection, we show inclusions between the introduced classes, i.e. we 
justify each arrow in Fig. 1. Note that in that figure, we have not only given 
the numbers of the statements, but also the numbers of two kinds of examples: 
examples that demonstrate that an inclusion is strict, and “counterexamples” 
that demonstrate that an inclusion does not hold without making additional 
assumptions.

We first leave aside the classes involving dynamic scheduling, i.e. input
($\mathcal{P}$-)termination and local delay termination, since for these classes, the
comparison is much less clear-cut.

Looking at the four remaining classes from an operational point of view, we
note that strong termination of a program and a query implies left-termination,
which in turn implies ∃-termination, which in turn implies bounded
nondeterminism. Examples 7.10, 8.1 and 9.1 show that these implications are strict.

Since the declarative characterisations of those notions are sound and
complete, the same strict inclusions hold among recurrence, acceptability,
fair-boundedness and boundedness. This allows for reusing or simplifying
termination proofs.

Theorem 10.1. Let P be a program and Q a query, |.| an extended level
mapping and I a Herbrand model of P. Each of the following statements strictly
implies the statements below it:

- P and Q are recurrent by |.|,

- P and Q are acceptable by |.| and I,

- P and Q are fair-bounded by |.| and I,

- P and Q are bounded by |.| and I.

Consider now local delay termination. Obviously, it is implied by strong
termination, and this implication is strict (Ex. 6.1). However, we have observed
with the programs and queries of Exs. 7.4 and 8.3 that local delay termination
does not imply left-termination or ∃-termination, in general. These results can be
obtained under reasonable assumptions, which, in particular, rule out deadlock.

The following proposition relates local delay termination with ∃-termination.

Proposition 10.2. Let P and Q be a permutation well moded program and
query, and |.| a moded level mapping.

If P and Q local delay terminate (w.r.t. |.|) then they ∃-terminate.

If P is delay-recurrent by |.| and some Herbrand interpretation then P and Q
are fair-bounded by some extended level mapping and Herbrand interpretation.

Proof. Since P and Q are permutation well moded, every query Q' in a derivation
of P and Q is permutation well moded [66], and so by Def. 2.2, Q' contains an
atom that is ground in its input positions and hence bounded w.r.t. |.|. Consider
the selection rule that always selects this atom together with all program clauses.
This selection rule is local and delay-safe, and it is a standard selection rule (since
there is always a selected atom). Therefore, local delay termination implies
∃-termination.

Concerning the second claim, since fair-boundedness is a complete
characterisation of ∃-termination, we have the conclusion.

The next proposition relates local delay termination with left-termination. 
In this case, programs must be well moded, not just permutation well moded. 
The proof is similar to the previous one but simpler.

Proposition 10.3. Let P and Q be a well moded program and query, and |.| a
moded level mapping.

If P and Q local delay terminate (w.r.t. |.|) then they left-terminate. 

If P is delay-recurrent by |.| and some Herbrand interpretation then P and 
Q are acceptable by some extended level mapping and Herbrand interpretation. 

Marchiori & Teusink [47] propose a program transformation such that the 
original program is delay-recurrent iff the transformed program is acceptable. 
This transformation allows us to use automated proof methods originally de- 
signed for acceptability for the purpose of showing delay-recurrence. 

Consider now input termination. As before, it is implied by strong
termination, and this implication is strict (Exs. 4.1 and 4.13). However, as observed
in Exs. 6.3, 7.3 and 8.3, input termination does not imply local delay
termination, left-termination, or ∃-termination, in general. Again, these results can be
obtained under reasonable assumptions.

The following proposition relates input termination to ∃-termination.

Proposition 10.4. Let P and Q be a permutation well moded program and
query. If P and Q input terminate then they ∃-terminate.

Let P and Q be a permutation well and simply moded program and query.
If P is simply acceptable by some |.| and I then P and Q are fair-bounded by
some extended level mapping and Herbrand interpretation.

Proof. The selection rule s constructed as in the proof of Prop. 10.2 is an
input-consuming selection rule, and also a standard selection rule. Therefore, input
termination implies universal termination w.r.t. {s} and hence ∃-termination.

Concerning the second claim, by Theorem 4.10, P and Q input terminate.
As shown above, this implies that they ∃-terminate. Since fair-boundedness is a
complete characterisation of ∃-termination, we have the conclusion.

The next proposition gives a direct comparison between input and left- 
termination. The proof is similar to the previous one. 

Proposition 10.5. Let P and Q be a well moded program and query. If P and 
Q input terminate then they left-terminate. 

Let P and Q be a well and simply moded program and query. If P is simply
acceptable by some |.| and I then P and Q are acceptable by some extended
level mapping and Herbrand interpretation.

To relate input termination to local delay termination, we introduce a notion
that relates delay-safe derivations with input-consuming derivations, based on
a similar concept from [5].

Definition 10.6. Let P be a program and |.| a moded generalised level
mapping.

We say that |.| implies matching (w.r.t. |.|) if for every atom $A = p(\mathbf{s}, \mathbf{t})$
bounded w.r.t. |.| and for every $B = p(\mathbf{v}, \mathbf{u})$ head of a renaming of a clause from
P which is variable-disjoint with A, if A and B unify, then $\mathbf{s}$ is an instance of $\mathbf{v}$.

Note that, in particular, |.| implies matching if every atom bounded by |.| is
ground in its input positions.

Proposition 10.7. Let P and Q be a permutation simply moded program and 
query, and |.| a moded generalised level mapping that implies matching. 

If P and Q input terminate then they local delay terminate (w.r.t. |.|). 

Proof. The conclusion follows by showing that any derivation of P and any
permutation simply moded query Q' via a local delay-safe selection rule (w.r.t. |.|) is
also a derivation via an input-consuming selection rule. So, let s be a local
delay-safe selection rule and Q' a permutation simply moded query such that s selects
atom $A = p(\mathbf{s}, \mathbf{t})$. Then by Def. 10.6, for each $B = p(\mathbf{v}, \mathbf{u})$, head of a renaming
of a clause from P, if A and B unify, then $\mathbf{s}$ is an instance of $\mathbf{v}$, i.e. $\mathbf{s} = \mathbf{v}\theta$ for
some substitution $\theta$ such that $Dom(\theta) \subseteq Vars(\mathbf{v})$. By Apt & Luitjes [5,
Corollary 31], this implies that the resolvent of Q' and any clause in P is again
permutation simply moded. Moreover, by applying the unification algorithm [2],
it is readily checked that, if A and B unify, then $\sigma = \theta \cup \{\mathbf{t}/\mathbf{u}\theta\}$ is an mgu.
Permutation simply-modedness implies that $\mathbf{s}$ and $\mathbf{t}$ are variable-disjoint. Moreover,
$\mathbf{s}$ and $\mathbf{v}$ are variable-disjoint. This implies that $Dom(\sigma) \cap Vars(\mathbf{s}) = \emptyset$, and so
the derivation step is input-consuming.

By repeatedly applying this argument to all queries in the SLD-derivation 
of P and Q via s, it follows that the derivation is via some input-consuming 
selection rule. 

Definition 10.6 seems to express the natural condition for level mappings that 
ensure input-consuming derivations. Note that the proposition is not straightfor- 
ward to generalise to, say, nicely moded programs, since in this case one cannot 
in general construct an mgu by matching as in the above proof. 

It remains an open question if simply-acceptability implies delay-recurrence 
under some general hypotheses. The problem with showing such a result lies in 
the fact that delay-recurrence is a sufficient but not necessary condition for local 
delay termination. 

Example 10.8. Consider again the program and the level mapping |.| of Ex. 6.11.
We have already observed that the program and any query local delay terminate.

In addition, given the mode {p(O), q(I), r(I)}, it is readily checked that
the program is simply moded, and that the level mapping is moded and implies
matching. Also, note that the program is simply acceptable by |.| and any
simply-local model.

However, this is not sufficient to show that the program is delay-recurrent, 
as proved in Ex. 6.11. Intuitively, the problem with showing delay-recurrence 
lies in the fact that the notion of cover does not appropriately describe the data 
flow in this program given by the modes. 

Finally, we consider input $\mathcal{P}$-termination. Obviously, if a program and query
input terminate, then they input $\mathcal{P}$-terminate. Whether or not this inclusion is
strict depends on whether $\mathcal{P}$ is a trivial property or not. Examples 5.1 and 5.6
demonstrate situations where it is strict.

There is little sense in making general comparisons between $\mathcal{P}$-selection rules
and the other classes: everything depends on $\mathcal{P}$. However, the following
generalisation of Prop. 10.7 is particularly interesting.

Proposition 10.9. Let P and Q be a permutation simply moded program and
query, and |.| a moded generalised level mapping that implies matching. Let $\mathcal{P}$
be the set of atoms that are bounded w.r.t. |.|.

If P and Q input $\mathcal{P}$-terminate then they local delay terminate (w.r.t. |.|).

Proof. By the same proof as that of Prop. 10.7, any derivation of P and
any permutation simply moded query Q' via a local delay-safe selection rule
(w.r.t. |.|) is also a derivation via an input-consuming selection rule. Moreover,
by the definition of $\mathcal{P}$, such a derivation is also a $\mathcal{P}$-derivation.



10.3 From Bounded Nondeterminism to Strong Termination

Consider now a program P and a query Q which either do not universally ter- 
minate for a set of selection rules in question, or simply for which we (or our 
compiler) fail to prove termination. We have already mentioned that, if P and Q 
have bounded nondeterminism then there exists an upper bound for the length 
of the SLD-refutations of P and Q. If the upper bound is known, then we can 
syntactically transform P and Q into an equivalent program and query that 
strongly terminate. As shown by Pedreschi & Ruggieri [58], such an upper bound 
is related to the natural number k of Def. 9.3 of bounded queries. As in our no- 
tation for moded atoms, we use boldface letters to denote vectors of (possibly 
non-ground) terms. 

Definition 10.10. Let P be a program and Q a query both bounded by |.| and
I, and let $k \in \mathbb{N}$. We define Ter(P) as the program such that:

- for every clause $p_0(\mathbf{t}_0) \leftarrow p_1(\mathbf{t}_1), \ldots, p_n(\mathbf{t}_n)$ in P, with $n > 0$, the clause

  $p_0(\mathbf{t}_0, s(D)) \leftarrow p_1(\mathbf{t}_1, D), \ldots, p_n(\mathbf{t}_n, D)$

  is in Ter(P), where D is a fresh variable,

- and, for every clause $p_0(\mathbf{t}_0)$ in P, the clause

  $p_0(\mathbf{t}_0, \_) \leftarrow$

  is in Ter(P).

Also, for the query $Q = p_1(\mathbf{t}_1), \ldots, p_n(\mathbf{t}_n)$, we define Ter(Q, k) as the query
$p_1(\mathbf{t}_1, s^k(0)), \ldots, p_n(\mathbf{t}_n, s^k(0))$.
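
The transformation of Def. 10.10 can be sketched as a small Prolog program operating on clause terms. This is a minimal illustration, not the implementation of [58]: the predicate names ter_clause/2, ter_body/3, add_arg/3, ter_query/3 and numeral/2 are hypothetical, SWI-Prolog is assumed, and clauses are assumed to be given as terms of the form (Head :- Body) or Head, with bodies that are plain conjunctions of atoms.

```prolog
:- use_module(library(lists)).   % append/3

% ter_clause(+Clause, -TerClause): add a depth argument to every atom.
% A rule gets s(D) in its head and D in each body atom; a fact gets a
% fresh (unconstrained) variable, as in Def. 10.10.
ter_clause((H :- B), (H1 :- B1)) :-
    !,
    add_arg(H, s(D), H1),
    ter_body(B, D, B1).
ter_clause(H, H1) :-
    add_arg(H, _, H1).

% ter_body(+Body, +Depth, -TerBody): give every atom of a conjunction
% the same extra argument Depth.
ter_body((A, B), D, (A1, B1)) :-
    !,
    add_arg(A, D, A1),
    ter_body(B, D, B1).
ter_body(A, D, A1) :-
    add_arg(A, D, A1).

% add_arg(+Atom, +Extra, -Atom1): append Extra as the last argument.
add_arg(A, X, A1) :-
    A =.. L0,
    append(L0, [X], L1),
    A1 =.. L1.

% ter_query(+Query, +K, -TerQuery): Ter(Q, K), every atom gets s^K(0).
ter_query(Q, K, Q1) :-
    numeral(K, SK),
    ter_body(Q, SK, Q1).

% numeral(+K, -Term): Term is the numeral s^K(0).
numeral(0, 0).
numeral(K, s(T)) :-
    K > 0,
    K1 is K - 1,
    numeral(K1, T).
```

For instance, the goal ter_clause((even(s(X)) :- odd(X)), C) binds C to a clause of the form (even(s(X), s(D)) :- odd(X, D)) with D fresh, and ter_query((even(X), odd(X)), 2, Q) binds Q to (even(X, s(s(0))), odd(X, s(s(0)))).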

The transformed program relates to the original one as shown in the following 
theorem.

Theorem 10.11 ([58,62]). Let P be a program and Q a query both bounded
by |.| and I, and let k be a given natural number satisfying Def. 9.3.

Then, for every $n \in \mathbb{N}$, Ter(P) and Ter(Q, n) strongly terminate.

Moreover, there is a bijection between SLD-refutations of P and Q via a
selection rule s and SLD-refutations of Ter(P) and Ter(Q, k - 1) via s.

The intuitive reading of this result is that the transformed program and query 
maintain the success semantics of the original program and query. Note that no 
assumption is made on the selection rule s, i.e. any selection rule is a terminating 
control for the transformed program and query. 

Example 10.12. Reconsider the program ODDEVEN and Q = even(X), odd(X) of
Ex. 9.1. The transformed program Ter(ODDEVEN) is:

even(s(X), s(D)) ← odd(X, D).
even(0, _).
odd(s(X), s(D)) ← even(X, D).

and the transformed query Ter(Q, k - 1) for k = 3 is

even(X, s^2(0)), odd(X, s^2(0)).

By Theorem 10.11, the transformed program and query terminate for any selec- 
tion rule, and the semantics w.r.t. the original program is preserved modulo the 
extra argument added to each predicate. 
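
To experiment with this example, the transformed program can be written in concrete Prolog syntax as below. This is a direct transcription of Ter(ODDEVEN); the only assumption is the use of a standard Prolog system (here SWI-Prolog) and its leftmost selection rule.

```prolog
% Ter(ODDEVEN) from Ex. 10.12: the second argument is the depth counter.
even(s(X), s(D)) :- odd(X, D).
even(0, _).
odd(s(X), s(D)) :- even(X, D).

% The transformed query Ter(Q, k - 1) for k = 3 fails finitely:
%   ?- even(X, s(s(0))), odd(X, s(s(0))).
%   false.
%
% A ground query within the bound still succeeds:
%   ?- even(s(s(0)), s(s(0))).
%   true.
%
% With too small a counter the same atom fails:
%   ?- even(s(s(0)), s(0)).
%   false.
```

The first query fails after finitely many steps because every use of a rule consumes one s/1 from the counter, so no derivation can apply more than two rules in a row.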

The transformations Ter(P) and Ter(Q, k) are of purely theoretical
interest. In practice, one would implement these counters directly into the
compiler/interpreter. Also, the compiler/interpreter should include a module that
infers an upper bound k automatically. Approaches to the automatic inference 
of level mappings and models are briefly recalled in the next section. Pedreschi 
& Ruggieri [58] give an example showing how the approach of Decorte et al. [29] 
could be rephrased to infer boundedness. 
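
One way to picture counters built into the interpreter is a depth-bounded meta-interpreter. The sketch below is illustrative only: solve/2 and step/2 are hypothetical names, SWI-Prolog is assumed, and the object program is the one of Ex. 9.6. The depth s(s(s(0))) corresponds to Ter(Q, k - 1) for k = 4, a value chosen here on the assumption that size(s(t)) = size(t) + 1, so that every instance of the query has level 3.

```prolog
% clause/2 is only guaranteed to inspect dynamic predicates.
:- dynamic all/3, q/2.

% Object program of Ex. 9.6, left untransformed.
all(N, N, [A]) :- q(N, A).
all(N, N1, [A|As]) :- q(N, A), all(s(N), N1, As).
q(Y, Y).

% solve(+Goal, +Depth): prove Goal, handling the counter of Ter(P)
% in the interpreter instead of in the program text.
solve(true, _) :- !.
solve((A, B), D) :- !, solve(A, D), solve(B, D).
solve(A, D) :- clause(A, Body), step(Body, D).

% step(+Body, +Depth): a fact ignores the counter (the anonymous
% variable of Def. 10.10); a rule consumes one s/1 and passes the
% remainder to all its body atoms.
step(true, _) :- !.
step(Body, s(D)) :- solve(Body, D).
```

With this interpreter, findall(As, solve(all(0, s(s(0)), As), s(s(s(0)))), L) terminates with L = [[0, s(0), s(s(0))]], whereas asking Prolog for all solutions of the plain query all(0, s(s(0)), As) does not terminate after the first answer.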

11 Related Work 

Termination in logic programming (and its extensions) has been the subject of 
intense research over the last fifteen years. The survey of De Schreye & Decorte 
[23], dated 1994, distinguishes three types of approaches: the ones that express 
necessary and sufficient conditions for termination, the ones that provide decid- 
able sufficient conditions, and the ones that prove decidability or undecidability 
for subclasses of programs and queries. Under this classification, this survey pa- 
per has been mainly concerned with the first type. While we do not even try 
to survey the large amount of literature on automatic or semi-automatic ap- 
proaches [14,21,29,23,44,52,53,71], it must be observed that existing tools typi- 
cally implement conditions for checking proof obligations of the characterisations 
we surveyed. As an example, a challenging topic of the research in automatic
termination inference consists in finding standard forms of level mappings and
models, so that the solution of the resulting proof obligations can be reduced to 
known problems for which efficient algorithms exist. Note that on a theoretical 
level the problem of deciding whether a program belongs to one of the classes 
studied in this article is undecidable. This was formally shown by Bezem [11] for 
recurrence, and by Ruggieri [62] for acceptability, fair-boundedness and bound- 
edness. Therefore, the conditions implemented by automatic tools are, inevitably, 
sufficient conditions. 

In the following, we recall other characterisations of the various notions of 
termination and relate them to those presented in this survey. 



11.1 Acceptability: the Modularity Issue 

A termination characterisation is modular if the proof obligations for the
program $P = P_1 \cup \ldots \cup P_n$ can be obtained from separate proof obligations of
programs $P_1, \ldots, P_n$. The modularity property is essential both in paper &
pencil proofs and in automatic tools, since it allows for reasoning on termination
of a large program by breaking it down to several small modules.

Since non-termination can only arise from recursion, the decomposition
$P_1, \ldots, P_n$ should partition P in such a way that all clauses defining two mutually
recursive predicates appear in the same module $P_i$. Therefore, a termination
characterisation is modular if the proof obligations for a clause defining a predicate
p depend only on predicates mutually recursive with p.

Apt & Pedreschi [8] refined acceptability to provide a partially modular 
method. The resulting notion, called semi-acceptability, requires that: for
every $A \leftarrow B_1, \ldots, B_n \in ground_L(P)$,

for all $i \in [1, n]$: $I \models B_1, \ldots, B_{i-1}$ implies

$|A| > |B_i|$ if $rel(A) \simeq rel(B_i)$,
$|A| \geq |B_i|$ otherwise.



Compared to acceptability, a strict decrease is now required for mutually recur- 
sive predicates only. Even if this simplifies proofs, it is a restricted notion of 
modularity, since changes in the level mapping of atoms defined in one module 
may make the proof obligations in higher modules invalid. 

Etalle et al. [32] proposed a refinement of acceptability (well-acceptability) for
well moded programs and queries. The requirement of well-modedness simplifies
proofs of acceptability. On the one hand, the decrease of the level mapping is
now required only for mutually recursive calls, i.e. for every
$A \leftarrow B_1, \ldots, B_n \in ground_L(P)$,

for all $i \in [1, n]$, $I \models B_1, \ldots, B_{i-1}$ and $rel(A) \simeq rel(B_i)$ imply $|A| > |B_i|$.

On the other hand, level mappings are assumed to be moded, and this leads
to no proof obligation on queries (or better, queries are bounded as an
immediate consequence). Also, it is interesting to observe that the definition of
well-acceptability is then very close to simply-acceptability (Def. 4.9). Actually,
well-modedness of a program and a query implies that atoms selected by the LD
selection rule are ground in their input positions, hence a derivation via the LD
selection rule is input-consuming.

De Schreye & Serebrenik [24] generalised well-acceptability to order-acceptability,
by having any well-founded ordering, not necessarily $\mathbb{N}$, as codomain of
level mappings. This allows us to show the same termination results and to
simplify termination proofs when complex level mappings may be needed.

11.2 Non-ground Characterisations of Left-Termination 

Alternative characterisations of left-termination consider proof obligations on 
generalised level mappings and thus on possibly non-ground instances of clauses 
and queries. Let us recall the well-known approach of Decorte et al. [25,29].
First, they use a non-ground notion of model. 

Definition 11.1. A generalised model of a program P is a set $I \subseteq Atom_L$ such
that for every $A \leftarrow B_1, \ldots, B_n \in inst_L(P)$,

$B_1, \ldots, B_n \in I$ implies $A \in I$.

Second, they require (generalised) level mappings to be invariant under instan- 
tiation for atoms that may appear in a derivation starting from a set of intended 
queries. This is the counterpart of acceptability of a(n atomic) query. 

Definition 11.2. For a program P and a set of queries $\mathcal{Q}$, let $Call(P, \mathcal{Q})$ be
the set of atoms selected along an SLD-derivation of P and any $Q \in \mathcal{Q}$ via the
LD selection rule.

A generalised level mapping |.| is rigid if for every $A \in Call(P, \mathcal{Q})$ and every
substitution $\theta$, we have $|A| = |A\theta|$.

Usually, abstract interpretation techniques allow us to compute a superset of
$Call(P, \mathcal{Q})$ given P and $\mathcal{Q}$, while for a broad class of norms, rigidity can be
verified syntactically [14].

The proof method, called rigid acceptability w.r.t. a set $\mathcal{Q}$, requires that for a
rigid level mapping |.| and a generalised model I: for every
$A \leftarrow B_1, \ldots, B_n \in inst_L(P)$,

for all $i \in [1, n]$, $I \models B_1, \ldots, B_{i-1}$ and $rel(A) \simeq rel(B_i)$ imply $|A| > |B_i|$.

If those proof obligations are satisfied, then P and every $A \in \mathcal{Q}$ left-terminate.

This characterisation is fully modular, i.e. it does not require P to be well- 
moded as in the case of well-acceptability. However, the characterisation is not 
complete. The main problem is due to the notion of rigidity. 

Example 11.1. The query p(X) and the simple program P below left-terminate.

p(a) ← p(b).
p(b).

Consider now $\mathcal{Q} = \{p(X)\}$. We have $Call(P, \mathcal{Q}) = \{p(X), p(a), p(b)\}$. However,
for any generalised level mapping |.|, the proof obligations require $|p(a)| > |p(b)|$,
which implies that |.| cannot be rigid on $Call(P, \mathcal{Q})$: rigidity would force
$|p(X)| = |p(a)|$ and $|p(X)| = |p(b)|$.

The source of the problem lies in the requirement $|A| = |A\theta|$ of Def. 11.2.
By assuming $|A| \geq |A\theta|$, the example program and query above can be reasoned
about.

(Footnote to Def. 11.1: A generalised model coincides with a set of valid
interargument relations in the terminology of [25,29].)

De Schreye and Serebrenik [24] have adapted this approach, i.e. the use of 
call sets, to general orderings, as opposed to level mappings. However, the aspect 
of incompleteness is pretty much the same as in the approach of Decorte et al. 
(see [24, Example 6]). 

A general solution is provided by Bossi et al. [14] consisting of: (1) generalised
level mappings with an arbitrary well-founded ordering as the codomain that
do not increase w.r.t. substitutions; (2) a specification (Pre, Post), with Pre,
$Post \subseteq Atom_L$, which is intended to characterise call patterns (Pre) and correct
instances (Post) of atomic queries. Call patterns provide information on the
structure of selected atoms, while correct instances provide information on data
flow. However, the proof obligations are not well suited for paper & pencil proofs,
since they require reasoning on the strongly connected components of a graph
abstracting the flow of control of the program under consideration.

11.3 Left-Termination with Respect to a Set of Queries 

Acceptability w.r.t. a set allows us to reason on a program and a set of queries, 
while acceptability seems to concentrate on a program and a single query at 
once. The benefit of acceptability w.r.t. a set consists of having just one single 
proof of termination for a set of queries rather than a set of proofs, one for each 
query in the set. 

However, we observe that in our examples on acceptability, proofs can easily 
be generalised to a set of queries. If this was not the case, the practical use of 
termination analysis would be very limited. For instance, given a level mapping 
such that lp(t)l = length{t), it is immediate to conclude that all queries p(T), 
where T is a list, are acceptable. 

Conversely, is it the case that if P and all queries in a set 𝒬 left-terminate 
then P and every Q ∈ 𝒬 are acceptable by the same |.| and I? 

The answer is affirmative. In fact, from the proof of the Completeness Theo- 
rem 7.8 [62, Theorem 2.3.20], if P and Q left-terminate then they are acceptable 
by a level mapping |.|_P and a Herbrand model I_P that only depend on P. This 
implies that every Q ∈ 𝒬 is acceptable by |.|_P and I_P. In conclusion, acceptabil- 
ity by |.|_P and I_P precisely characterises the maximal set 𝒬 such that P and Q 
left-terminate for each Q ∈ 𝒬. 

11.4 Permutation Terminating Programs 

A permutation of a program P (resp., query Q) is any program (query) obtained 
by reordering clause body atoms in P (atoms in Q). We say that P and Q 
permutation terminate if for some permutation P' of P and Q' of Q, P' and Q' 
left-terminate. Observe that permutation termination is strictly weaker than left- 
termination, and strictly stronger than ∃-termination (e.g. program PRODCONS 
in Fig. 10 and system(n), with n ∈ ℕ, ∃-terminate but do not permutation 
terminate). 

We have not included permutation termination in our formal hierarchy since 
it is trivial from a theoretical point of view to relate it to left-termination: simply 
analyse all possible permutations of the program and query for left-termination. 
Permutation termination is mainly an issue for automatic tools, since one would 
like to compute this permutation efficiently. 
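
The reduction to left-termination mentioned above can be sketched as a brute-force enumeration. The following Python fragment is only an illustration under stated assumptions: the clause representation (a head paired with a tuple of body atoms, atoms as plain strings) and the two-clause program are hypothetical, and no left-termination test is implemented. It merely generates the candidate programs that such a test would have to be applied to.

```python
from itertools import permutations, product


def body_permutations(program):
    """Yield every program obtained by reordering the body atoms of each clause.

    A program is a list of clauses; a clause is a pair (head, body), where
    body is a tuple of atoms. Atoms are plain strings here (an assumption of
    this sketch), since only their order matters.
    """
    per_clause = [
        [(head, perm) for perm in permutations(body)]
        for head, body in program
    ]
    # One candidate program per choice of a body ordering for every clause.
    for choice in product(*per_clause):
        yield list(choice)


# Hypothetical two-clause program, used only to count the candidates.
program = [
    ("r(X)", ("p(X, Y)", "r(Y)")),
    ("q(X)", ("a(X)", "b(X)", "c(X)")),
]
candidates = list(body_permutations(program))
print(len(candidates))
```

The number of candidates is the product of the factorials of the body lengths, which is why the efficient computation of a suitable permutation matters for automatic tools.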

Deransart & Małuszyński [30] presented the proof obligations of their method 
by considering a generic permutation of body atoms. However, the choice of the 
permutation is left to the user. 

The inference of an appropriate permutation has been proposed by Speirs 
et al. [71] and by Hoarau & Mesnard [39]. In [71], mode and type information 
provided by the programmer are used to reorder the body atoms. The resulting 
static termination algorithm is part of the Mercury system [70]. In contrast, 
the approach of [39] aims at inferring as large a set of queries as possible for 
which a program permutation terminates, without involving the programmer in 
additional specifications. 



11.5 Transformational Approaches 

It is possible to investigate termination of logic programs by transforming them 
to some other formal system. If the transformation preserves termination, one 
can resort to the compendium of techniques of those formal systems for the 
purpose of proving termination of the original logic program. 

Baudinet [10] considered transforming logic programs into functional pro- 
grams. Termination of the transformed programs can then be studied by struc- 
tural induction. Her approach covers general logic programs, existential termina- 
tion and the effects of the Prolog cut. Also, there is a considerable body of liter- 
ature on transforming logic programs to term rewriting systems (TRSs), where 
a large set of well-founded orderings is available for reasoning about termination. 
It is very common in these transformational approaches to use modes. The 
intuitive idea is usually that the input of an atom has to rewrite into the output 
of that atom. Most of those works assume the LD selection rule [9,35,41,56]. One 
notable exception is due to Krishna Rao et al. [43], where termination is con- 
sidered w.r.t. selection rules that respect a producer-consumer relation among 
variables in clauses. Such a producer-consumer relation is formalised with an 
extension of the notion of well-modedness. 

While the transformation must be sound (if the transformed program ter- 
minates then the original one terminates as well), the converse (if the original 
program terminates then the transformed one terminates as well) is not well 
studied. One remarkable exception is the approach by Aguzzi & Modigliani [1], 
whose transformation is complete, albeit only for the limited class of input driven 
logic programs [4]. So for this limited class, a program terminates if and only if 
the corresponding TRS terminates. 

11.6 Integer and Floating-Point Computations 

For efficiency reasons, integers and integer predicates are implemented in Prolog 
(and other logic programming languages) by means of special terms and predi- 
cates, the built-ins of the system. As an example, 3 < (2+2) is an atom containing 
the less-than predicate < and the ground arithmetic expression terms 3 and 
(2+2). As one could expect, the resolution of the atom above leads to success. 

Integer arithmetic does not require special treatment when termination does 
not depend on integer computation, such as in the definition of the partition 
predicate in Ex. 7. In contrast, in the presence of integer computations, the definition 
of the level mapping might take into account integer arguments of atoms. The 
approach of Dershowitz et al. [31] automatically deduces from a given program 
a finite abstract domain for representing ranges of integer expressions involved 
in program clauses. The abstract domain serves as a basis for checking that 
level mappings decrease over recursive calls. 

Serebrenik [64] shows that when integer arguments are critical for termination, 
the definition of a level mapping may not be as simple as expected, e.g. it may 
be non-linear. He proposed and implemented a sufficient condition for partition- 
ing integers into intervals (called adornments) such that a linear level mapping 
can be defined on each of them. Furthermore, Serebrenik & De Schreye [65] 
extended the approach to reason about floating-point computations, i.e. in the 
presence of rounding errors. 

Also, Apt et al. [6] proposed a variant of acceptability for reasoning on built- 
in predicates, including arithmetic ones, var() and ground(). Their key concept 
is a specialised semantics (called 0-semantics) and a notion of model w.r.t. such 
semantics to be used instead of Herbrand models in the definition of acceptabil- 
ity. 

11.7 Dynamic Scheduling 

The term dynamic scheduling refers to selection rules where the selection of an 
atom depends on its degree of instantiation at runtime. Dynamic scheduling can 
be implemented using delay declarations as provided by Gödel [38] or SICStus 
[73], or using guards (see Subsec. 11.12). 

We believe that modes are important for understanding dynamic scheduling, 
even though some authors have not used them explicitly [45,47,49,55]. Modes 
are the basis for defining input-consuming derivations, which are a formalism 
for describing dynamic scheduling while abstracting from the technical details 
of delay declarations. We also believe that within dynamic scheduling, there 
is an important qualitative distinction between what we call (here) weak and 
strong selection rules. Weak selection rules are achieved by delay declarations 
that test for arguments being at least non-variable, and ideally correspond to 
input-consuming selection rules. Strong selection rules ensure that the depth of 
the SLD-tree of an atom is bounded at the time of selection, and more or less 
correspond to delay-safe selection rules. 

Naish [55] considers delay declarations that would fall under weak selection 
rules. Naish has given two intuitive causes for loops: circular modes and specula- 
tive output bindings. The first cause (see Ex. 4.4) can be eliminated by requiring 
programs to be permutation nicely moded. Speculative output bindings are in- 
deed a good explanation for the fact that permute(O, I) (see Ex. 5.1) does not 
input terminate. Naish then makes the additional assumption that the selection 
rule always selects the leftmost selectable atom, and proposes to put recursive 
calls last in clause bodies. Effectively, this guarantees that the recursive calls are 
ground in their input positions, which would fall under strong selection rules. 

Lüttringhaus-Kappel [45] proposed a method for generating delay declara- 
tions automatically. The method finds acceptable delay declarations, ensuring 
that the most general selectable atoms have finite SLD-trees. What is required 
however are safe delay declarations, ensuring that instances of most general se- 
lectable atoms have finite SLD-trees. A safe program is a program for which 
every acceptable delay declaration is safe. Lüttringhaus-Kappel states that all 
programs he has considered are safe, but gives no hint as to how this might be 
shown in general. This work is hence not about proving termination. Sometimes 
the generated delay declarations would fall under weak selection rules, but in 
some cases, the delay declarations require an argument of an atom to be a list 
before that atom can be selected, which would fall under strong selection rules. 

Apt & Luitjes [5] made a first attempt to show termination for dynamic 
scheduling. They considered deterministic programs, i.e. programs where for 
each selectable atom (according to the delay declarations) there is at most one 
clause head unifiable with it. For such programs, the existence of one successful 
derivation implies that all derivations are finite. Such a class of programs, how- 
ever, is of limited interest. Apt & Luitjes also give conditions for the termination 
of APPEND, but these are ad-hoc and do not address the general problem. 

The work by Marchiori & Teusink [47], which we surveyed in Sec. 6, not only 
assumes strong selection rules, but in addition selection rules must be local. A 
limitation of their method lies in the fact that the notion of cover is just an 
approximation of the data flow in a program (see Ex. 6.11). No implementation 
of local selection rules is mentioned by the authors. We refer to the conclusion 
for further discussion. 

Martin & King [49] ensure termination by imposing a depth bound on the 
SLD-tree. This is realised by a program transformation introducing additional 
argument positions for each predicate, which are counters for the depth of the 
computation. Of course, this falls under strong selection rules. 

Naish’s proposal [55] has been formalised and refined by Smaus et al. [69]. 
The authors consider atoms that may loop when called with insufficient input. 
It is proposed to place such atoms sufficiently late; all producers of input for 
such atoms must occur textually earlier. Effectively, this is a hybrid selection 
rule where strong assumptions are made only for certain atoms. 

Footnote: “permutation nicely moded” is a generalisation of “permutation simply moded”. 

Concerning input termination, the first sound but incomplete characterisa- 
tion assumed well and nicely moded programs [67]. It was then found that the 
condition of well-modedness could easily be lifted [16]. By restricting to simply 
moded programs, it was possible to give a characterisation that is also complete 
[17], which is the work we survey in Sec. 4. It has been shown that under nat- 
ural conditions, input-consuming derivations can be implemented using delay 
declarations [15,17,66]. 

The recent work of [68] considers input-consuming selection rules with addi- 
tional assumptions. In one dimension, a selection rule can be parametrised by 
a property 𝒫 that the selected atoms must have. This can be used to formalise 
delay-safe selection rules as we did in Sec. 5. However, the notion of 𝒫-derivation 
abstracts from the distinction between weak and strong selection rules, since 𝒫 
could be any instantiation property. In another dimension, a selection rule can 
be local or not (necessarily) local. These dimensions can freely be combined. 

11.8 ∃-Termination 

Concerning termination w.r.t. fair selection rules, i.e. ∃-termination, we are aware 
only of the works of Gori [36] and McPhee [50]. Gori proposed an automatic sys- 
tem based on abstract interpretation analysis that infers ∃-termination. McPhee 
proposed the notion of prioritised fair selection rules, where atoms that are 
known to terminate are selected first, with the aim of improving efficiency of 
fair selection rules. He adopts the automatic test of Lindenstrauss & Sagiv [44] 
to infer (left-)termination, but, in principle, the idea applies to any automatic 
termination inference system. 

11.9 Bounded Nondeterminism 

Sufficient (semi-)automatic methods to approximate the number of computed 
instances by means of lower and upper bounds have been studied in the context 
of cost analysis of logic programs [26] and of cardinality analysis of Prolog pro- 
grams [18]. As an example, cost analysis is exploited in the Ciao-Prolog system 
[37]. Of course, if ∞ is a lower bound to the number of computed instances of P 
and Q then they do not have bounded nondeterminism. Dually, if n ∈ ℕ is an 
upper bound then P and Q have bounded nondeterminism. In this case, how- 
ever, we are still left with the problem of determining a depth of the SLD-tree 
that includes all the refutations. 

The idea of cutting unsuccessful SLD-derivations is common to the research 
area of loop checking (see e.g. [12]). While a run-time analysis is potentially 
able to cut more unsuccessful branches, the evaluation of a pruning condition at 
run-time, such as for loop checks, involves a considerably higher computational 
overhead than statically checking the boundedness proof obligations. 

11.10 General Programs 

General programs admit negative literals in clause bodies and in queries. In pres- 
ence of negation, there are several execution models proposed in the literature. 
The most widely known is SLDNF-resolution, where negation is interpreted 
by the negation-as-failure rule. A declarative characterisation of strong termina- 
tion for general logic programs and queries was proposed by Apt & Bezem [3]. 
They assume safe (not to be confused with delay-safe [47]) selection rules, mean- 
ing that negative literals can be selected only if they are ground. Apt & Pedreschi 
[7] have generalised acceptability to reason on programs with negation under 
SLDNF resolution. The characterisation is sound. Also, it is complete for safe 
selection rules. 

When turning to other execution models, the class of (left-)terminating pro- 
grams and queries may differ. A declarative characterisation of left-termination 
was provided by Marchiori [46] in the context of constructive negation by extend- 
ing acceptability. Also, an elaborate extension of the notion of recurrence has been 
proposed in the context of SLDNFA-resolution by Verbaeten [76], and in the 
context of the EK-proof procedure by Mancarella et al. [57]. 

Finally, the modularity issue for general programs is discussed by Bossi et 
al. [13]. 

11.11 Extensions of LP: Constraint Logic Programs 

The first work on characterisations of (left-)termination in constraint logic pro- 
gramming (CLP) is due to Colussi et al. [22], who proposed a necessary and 
sufficient condition inspired by the method of Floyd for termination of flowchart 
programs [33]. Their method consists of assigning a data flow graph to a pro- 
gram, where each node is labelled with the set of constraint stores of calls that 
may reach the associated program point. The decreasing of a function on every 
cycle of the data flow graph is then a necessary and sufficient condition for left- 
termination. A drawback of the method is that the set of constraints associated 
to nodes must be specified (the approach is not automated), which means rea- 
soning operationally (as opposed to declaratively in terms of level mappings) on 
the program. 

Ruggieri [61] proposed an extension of acceptability that is sound and com- 
plete for ideal CLP languages. A CLP language is ideal if its constraint solver, 
the procedure used to test consistency of constraints, returns true on a consistent 
constraint and false on an inconsistent one. In contrast, a non-ideal constraint 
solver may return unknown if it is unable to determine (in)consistency. An ex- 
ample of a non-ideal CLP language is the CLP(ℛ) system, for which Ruggieri 
proposes proof obligations (based on a notion of modes) in addition to accept- 
ability in order to obtain a sound characterisation of left-termination. 

Mesnard [51] provided sufficient termination conditions based on approxi- 
mation techniques and Boolean μ-calculus, with the aim of inferring a class of 
left-terminating CLP queries. Recently, the approaches of Mesnard and Ruggieri 
have been merged into a unified framework [54], for which an implementation is 
described in [52]. 

Finally, Frühwirth [34] adapted the notion of recurrent logic programs to show 
termination of constraint handling rules, a language closely related to concurrent 
constraint programming and especially designed for writing constraint solvers. 
11.12 Extensions of LP: Programs with Guards 

The definition of input-consuming derivations has a certain resemblance with 
derivations in the parallel logic language of (Flat) Guarded Horn Clauses [75]. 
In (F)GHC, an atom and clause may be resolved only if the atom is an instance of 
the clause head, and a test (guard) on clause selectability is satisfied. Termination 
of GHC programs was studied by Krishna Rao et al. [42] by transforming them 
into TRSs. 

Pedreschi & Ruggieri [59] characterised a class of programs (with guards 
and delay declarations) and queries that have no failed derivation. For those 
programs, termination for one selection rule implies termination (with success) 
for all selection rules. This situation has been previously described as saying that 
a program does not make speculative bindings [69]. The approach by Pedreschi 
& Ruggieri is an improvement w.r.t. the latter one, since what might be called 
“shallow” failure does not count as failure. For example, the program QUICKSORT 
is considered failure-free in the approach of [59]. 



11.13 Extensions of LP: Tabled Programs 

Tabled logic programming is particularly interesting since tabling improves the 
termination behaviour of a logic program, compared to ordinary execution. 

A declarative characterisation of tabled left-termination has been given by 
Decorte et al. [28]. The method can show termination in interesting cases where 
ordinary execution does not terminate. The approach has been extended and 
automated by Verbaeten et al. [77], where a mix of tabled and ordinary SLD- 
resolution is also studied. The characterisation provided is in general sound, and 
complete under some conditions on tabled predicates. 

12 Conclusion 

In this article, we have surveyed seven different classes of terminating logic pro- 
grams and queries. For each of them, we have provided a sound declarative 
characterisation of termination, which, in five cases, was also complete. We have 
offered a unified view of those classes allowing for non-trivial formal comparisons. 
In particular, we have shown strict inclusions among the classes, establishing the 
hierarchy shown in Fig. 1. We conclude by discussing two questions: Why, in 
some cases, did we need additional assumptions to obtain a unified view? How 
significant are the classes of the hierarchy? 

To make the first question more specific: why do the inclusions between ter- 
mination for dynamic selection rules on the one hand and left-termination and 
∃-termination on the other hand not simply hold without additional assump- 
tions? We have two kinds of counterexamples. 

We have counterexamples where the textual order of atoms in the clause 
bodies of a program makes the program unsuitable for the LD selection rule 
(Exs. 7.3 and 7.4). It is not pathological for a program to be written for, say, the 
RD selection rule, but we should not be surprised about pathological (i.e. non- 
terminating) behaviour when we run the program using the LD selection rule. 

Moreover, we have counterexamples where a program input terminates, or 
local delay terminates, thanks to deadlock (Ex. 8.3). Is a program that relies on 
deadlock for termination pathological? Generally, deadlock is considered an un- 
desirable situation, but it is still preferable to non-termination. Also, it should be 
noted that deadlock cannot necessarily be blamed on the program. The APPEND 
program and the query append([1|Xs], Ys, Zs) do not ∃-terminate, but they input 
terminate (for the mode input(I, I, O)), and in this sense, one could argue that 
selection rules allowing for deadlock are a stronger assumption for termination 
than any standard selection rule. This is in contrast to Props. 10.2, 10.3, 10.4 
and 10.5 (where the hypotheses imply absence of deadlock). 

Concerning the second question, there is of course a general answer: this is 
a survey article, and so we surveyed those works that are commonly recognised 
as most relevant in the field of termination for various selection rules, even if 
sometimes the significance of a result is diminished by a later result. However, 
we also have a few more specific answers. 

The interest in strong termination, ∃-termination and bounded nondetermin- 
ism is evident because they are cornerstones of the whole spectrum of classes. 
The interest in left-termination is motivated by the fact that the standard se- 
lection rule of Prolog is assumed. With the three classes related to dynamic 
scheduling, we have captured the important distinction between weak selection 
rules, strong selection rules, and strong and local selection rules, as explained in 
Subsec. 11.7. 

The question can also be phrased differently: for each inclusion between 
classes, how significant is it that the inclusion is strict? If A ⊂ B but B \ A 
contains only some very obscure and contrived programs, then is it worthwhile 
to study B in detail? 

The strict inclusion between input termination and input 𝒫-termination, 
for an appropriate 𝒫, is witnessed by Exs. 5.1 and 5.6. These programs are 
not contrived, in fact they are famous in this context [55], but they are small 
programs, and it remains to be seen if other examples can be found. 

In our opinion, the strict inclusion between local delay termination and left- 
termination demonstrated by Ex. 7.1 is insignificant. The example is artificial. 
Most of the time, the LD selection rule turns out to be a simple implementation 
of a local delay-safe selection rule, no more and no less. 

Example 6.10 is very similar to Ex. 7.1 and suggests that the strict inclusion 
between input 𝒫-termination and local delay termination is also insignificant, 
or put differently, that the difference made by assuming local selection rules 
is insignificant. Actually, we are not aware of a realistic program where locality 
matters for termination. However, Ex. 6.1 exhibits a certain pattern that suggests 
that there could be a realistic example: consider the clause r(X) ← p(X, Y), r(Y). 
There are two derivations for p(X,Y), one that generates a Y bigger (say, by the 
term size norm) than X but is bound to fail, and one that generates a Y smaller 
than X and succeeds. Locality is crucial so that this failure occurs before the 
recursive call r(Y). 

Marchiori & Teusink justify the restriction of local derivations saying that 
“the termination behaviour of ‘delay until nonvar’ is poorly understood”, and 
that “the class of local selection rules [. . . ] supports simple tools for proving 
termination” [47]. In the meantime, as discussed in Sections 4 and 5, both ter- 
mination for input-consuming derivations and termination for delay-safe (but 
not necessarily local) derivations are well understood. 

Can we conclude from the above that the strict inclusion between input 
𝒫-termination and left-termination is insignificant, and so all the research ef- 
fort currently devoted to left-termination should be redirected towards input 
𝒫-termination? Not quite. Left-termination is the most important notion of ter- 
mination in practice and has been studied under every conceivable aspect. One 
cannot expect that all this work will readily translate to input 𝒫-termination. 

Abstract. Reasoning about termination is a key issue in logic program 
development. One classic technique for proving termination is to con- 
struct a well-founded order on goals that decreases between successive 
goals in a derivation. In practice, this is achieved with the aid of a level 
mapping that maps atoms to natural numbers. This paper examines why 
it can be difficult to base termination proofs on natural level mappings 
that directly relate to the recursive structure of the program. The no- 
tions of bounded-recurrency and bounded-acceptability are introduced 
to alleviate these problems. These concepts are equivalent to the classic 
notions of recurrency and acceptability respectively, yet provide practi- 
cal criteria for constructing termination proofs in terms of natural level 
mappings for definite logic programs. Moreover, the construction is en- 
tirely modular in that termination conditions are derived in a bottom-up 
fashion by considering, in turn, each of the strongly connected components 
of the program. 



1 Introduction 

The classes of recurrent and acceptable programs are, arguably, two of the most 
influential classes of logic program that occur in the termination literature. Ac- 
ceptable programs are precisely those which, for ground input, terminate under 
the left-to-right selection rule of Prolog [2]. Programs which, for ground input, 
terminate under any selection rule are classified as being recurrent [5]. 

Whilst the notions of recurrency and acceptability provide a sound theoret- 
ical basis for studying termination, they do not provide much insight into the 
practicalities of deriving the level mappings which are needed to prove that a 
logic program is terminating or left-terminating. Instead, intuition has served as 
the guide in the development of automatic techniques. In particular, there has 
been a desire to derive natural level mappings based on the recursive structure 
of the program at hand. For example, given the program 

p([H|T]) ← p(T). 

it is natural to define a level mapping |.| to prove termination by |p(x)| = |x|_length 
where |.|_length is the list-length norm because the predicate is inductively de- 
fined over the length of its argument which is a list. Other definitions, such as 
|p(x)| = |x|_length + 1 and |p(x)| = 2|x|_length do not possess the same natural 
correspondence with the termination behaviour of the program. 

This paper examines the reasons why termination proofs based on recurrency 
and acceptability are often difficult to obtain. The observations are not new in 
themselves [3,6,12,18] and, by way of a solution, Apt and Pedreschi [3] define al- 
ternative characterisations of terminating and left-terminating programs which 
they call semi-recurrency and semi-acceptability respectively. This paper argues 
that these concepts do not, in fact, form an ideal basis for automatic termina- 
tion analyses (though this approach has been followed by others [25,28]); some 
difficulties complicate the construction of the level mapping that arises in the 
termination proof. To alleviate these problems, this paper introduces notions of 
bounded-recurrency and bounded-acceptability for definite logic programs and 
shows that these concepts are equivalent to recurrency and acceptability respec- 
tively. These new characterisations of the two classes provide practical criteria 
for constructing termination proofs in terms of natural level mappings. The con- 
struction is entirely modular: termination conditions are derived in a bottom-up 
fashion by considering, in turn, each of the strongly connected components of the 
predicate dependency graph. A bottom-up approach is more in tune with pro- 
gram specialisation, and partial deduction in particular [10,24], since the overall 
computation is unlikely to be terminating but some sub-computations probably 
will be. More exactly, it is more useful to derive sufficient termination condi- 
tions for individual predicates rather than proving that a given top-level goal 
will terminate. The notion of bounded-acceptability lends itself naturally to this 
process. Moreover, there has been much recent interest in the inference of level 
mappings [9,15,19,20,23,27] in order to fully automate termination analysis. Thus 
the desire for natural level mappings is much more than an aesthetic predilection. 

The paper is structured as follows. Section 2 introduces the concepts neces- 
sary for discussing termination, and in particular reviews the notions of recur- 
rency and acceptability. Section 3 argues that level mappings have traditionally 
been overloaded in that they address two different termination issues. Sections 4 
and 5 review the concepts of semi-recurrency and semi-acceptability, arguing 
that these notions also lead to artificial level mappings. Sections 6 and 7 explain 
how the concepts of bounded-recurrency and bounded-acceptability permit sim- 
pler, more natural level mappings to be used within termination proofs. Section 8 
presents the concluding discussion, reflecting on other approaches to modularity 
[6,18]. 

2 Preliminaries: the Nuts and Bolts of Termination 

2.1 Level Mappings, Norms, and Boundedness 

The fundamental idea underlying all termination proofs is to define an order 
on the goals that can occur within a derivation. Given a program P and goal 
G_0, the finiteness of a derivation G_0, G_1, G_2, ... is in principle straightforward 
to demonstrate: it is sufficient to construct a well-founded order < such that 
G_{i+1} < G_i for all i ≥ 0. The problem is to find such an order. To simplify the 
problem, it is convenient to define the order on abstractions of goals rather than 
on the goals themselves. Thus the order < is defined such that G' < G holds iff 
A(G') < A(G) holds where A is an abstraction function. For example, A might 
be defined to map each goal G to a multiset of natural numbers, where each atom 
in G maps to a single number in the multiset. The idea of abstracting goals by 
mapping atoms to natural numbers leads to the concept of a level mapping. 

Definition 1 (level mapping [11]). Let P be a program. A level mapping for 
P is a function |.| : B_P → ℕ from the Herbrand base of P to the set of natural 
numbers ℕ. For an atom A ∈ B_P, |A| denotes the level of A. □ 



Example 1. Let P be the program 

p(a, X) ← p(b, X). 
p(b, a). 
p(b, b). 

The function |.| : {p(a, a), p(a, b), p(b, a), p(b, b)} → ℕ defined by |p(a, a)| = 34, 
|p(a, b)| = 12, |p(b, a)| = 0 and |p(b, b)| = 27 is a level mapping for P. □ 

Since a level mapping is defined over the Herbrand base it is not defined for non- 
ground atoms. (The reader is referred to Lloyd [21] for the standard definitions of 
the Herbrand base, Herbrand interpretations, Herbrand models, etc.) The lifting 
of the mapping to non-ground atoms was proposed in [4]. 

Definition 2 (bounded atom [4]). An atom A is bounded wrt a level mapping 
|.| if |.| is bounded on the set [A] of variable-free instances of A. If A is bounded 
then |[A]| denotes the maximum that |.| takes on [A]. □ 

The importance of the notion of boundedness cannot be overstressed. Since 
goals which are ground cannot be used to compute values, they are the exception 
rather than the norm in logic programming. Thus practical termination proofs 
must be able to deal with non-ground goals and boundedness provides the basis 
for this. 



Example 2. Let P be the program and |.| the level mapping of example 1. 
The atom p(a, X) is bounded since |.| is bounded on the set [p(a, X)] = 
{p(a, a),p(a, b)}. Moreover, |[p(a, X)]| = max({|p(a, a)|, |p(a, b)|}) and in par- 
ticular |[p(a, X)]| = max({34, 12}) = 34. □ 
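
The computation in example 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: atoms are encoded as tuples, capitalised strings stand for variables, and the names `LEVEL`, `ground_instances` and `bound` are hypothetical helpers that do not come from the paper. Because the Herbrand universe of the program is the finite set {a, b}, every atom is bounded here.

```python
from itertools import product

# Level mapping of example 1, written as a table over the Herbrand base.
LEVEL = {("p", "a", "a"): 34, ("p", "a", "b"): 12,
         ("p", "b", "a"): 0, ("p", "b", "b"): 27}
CONSTANTS = ["a", "b"]  # Herbrand universe of the program


def is_var(term):
    # Assumed convention: variables are capitalised strings.
    return term[:1].isupper()


def ground_instances(atom):
    """Enumerate [A], the variable-free instances of the atom."""
    pred, *args = atom
    variables = sorted({t for t in args if is_var(t)})
    for values in product(CONSTANTS, repeat=len(variables)):
        binding = dict(zip(variables, values))
        yield (pred, *[binding.get(t, t) for t in args])


def bound(atom):
    """Return |[A]|, the maximum level over the ground instances."""
    return max(LEVEL[inst] for inst in ground_instances(atom))
```

For instance, `bound(("p", "a", "X"))` takes the maximum of the levels of p(a, a) and p(a, b).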

Level mappings are usually defined in terms of norms. Basically, a norm is 
a mapping from terms to natural numbers which provides some measure of the 
size of a term. 



Example 3. The list-length norm $|.|_{length} : U_P \to \mathbb{N}$ from the Herbrand universe
to the natural numbers can be defined by

$$|t|_{length} = \begin{cases} 1 + |t_2|_{length} & \text{if } t = [t_1|t_2] \\ 0 & \text{otherwise} \end{cases}$$

Then, for example, $|[X, Y, Z]|_{length} = 3$. □

Example 4. The term-size norm $|.|_{size} : U_P \to \mathbb{N}$ is defined by

$$|f(t_1, \ldots, t_n)|_{size} = n + \sum_{i=1}^{n} |t_i|_{size}$$

Thus, for example, $|f(a, g(b))|_{size} = 2 + 1 = 3$. □
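
Both norms are straightforward to compute. The Python sketch below assumes a hypothetical term encoding that is not part of the paper: a term is a tuple whose first component is the function symbol and whose remaining components are the arguments, with lists built from the binary symbol `"."` and the constant `"[]"`.

```python
def const(name):
    """A constant is a term with no arguments."""
    return (name,)


NIL = const("[]")


def cons(head, tail):
    """The list [head|tail]."""
    return (".", head, tail)


def make_list(items):
    """Build the term for a Prolog-style list from a Python list of terms."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result


def list_length(term):
    # 1 + |t2|_length if t = [t1|t2], and 0 otherwise.
    if term[0] == "." and len(term) == 3:
        return 1 + list_length(term[2])
    return 0


def term_size(term):
    # n plus the sum of the sizes of the n arguments.
    args = term[1:]
    return len(args) + sum(term_size(arg) for arg in args)
```

With this encoding, `term_size(("f", const("a"), ("g", const("b"))))` reproduces the value computed in example 4.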

The next two lemmas follow easily from definition 2. 

Lemma 1. Let |.| be a level mapping and A a bounded atom. Then for every
substitution $\theta$, the atom $A\theta$ is also bounded and moreover $|[A]| \geq |[A\theta]|$. □

Proof. Recall that $[A] = \{A\phi \mid \phi \text{ is a grounding substitution for } A\}$. Then
$[A] \supseteq [A\theta]$, so $|[A]| \geq |[A\theta]|$. □



Lemma 2. Let H be a bounded atom, B an atom and |.| a level mapping. If
for every grounding substitution $\theta$ for H and B, $|H\theta| > |B\theta|$, then B is also
bounded and moreover $|[H]| > |[B]|$. □

Proof. Recall that $[B] = \{B\theta \mid \theta \text{ is a grounding substitution for } B\}$. But $|H\theta| >
|B\theta|$ for every grounding substitution $\theta$ for H and B, so |.| is bounded on [B].
Let $\theta$ be any grounding substitution for H and B such that $|B\theta| = |[B]|$. Then,
by lemma 1, $|[H]| \geq |[H\theta]| = |H\theta| > |B\theta| = |[B]|$. □

2.2 Recurrency 

In [4,5], level mappings are used to define a class of terminating programs. 

Definition 3 (recurrency [4,5]). Let P be a definite logic program and |.| a
level mapping for P. A clause $H \leftarrow B_1, \ldots, B_n$ is recurrent (wrt |.|) iff for every
grounding substitution $\theta$ and for all $i \in [1,n]$ it follows that $|H\theta| > |B_i\theta|$. P is
recurrent (wrt |.|) iff every clause in P is recurrent (wrt |.|). □

Henceforth all logic programs are assumed to be definite, that is, each clause 
contains precisely one atom in its consequent (its head). 

Example 5. Consider the append program below 

app1  append([], X, X).
app2  append([U|X], Y, [U|Z]) ← append(X, Y, Z).

The clause app2 is recurrent wrt the level mapping $|append(t_1, t_2, t_3)|_1 =
|t_1|_{length}$ since for every grounding substitution $\theta$ for app2,

$$\begin{aligned} |append([U|X], Y, [U|Z])\theta|_1 &= |[U|X]\theta|_{length} \\ &= 1 + |X\theta|_{length} \\ &> |X\theta|_{length} \\ &= |append(X, Y, Z)\theta|_1 \end{aligned}$$
Similarly, it can be shown that the program is recurrent wrt $|.|_i$ for all $i \in [2,4]$
where $|.|_2$, $|.|_3$ and $|.|_4$ are defined by

$$\begin{aligned} |append(t_1, t_2, t_3)|_2 &= 3|t_1|_{length} + 1 \\ |append(t_1, t_2, t_3)|_3 &= |t_3|_{length} \\ |append(t_1, t_2, t_3)|_4 &= \min(|t_1|_{length}, |t_3|_{length}) \end{aligned}$$

Moreover, the clause app1 is trivially recurrent wrt any level mapping. □
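
The decrease required of the recursive append clause can be spot-checked numerically. The Python sketch below is a finite sample check, not a proof. It relies on the observation that each of the four level mappings depends only on the list lengths of the first and third arguments, and that a grounding substitution may assign those lengths independently. The names `LEVELS`, `app2_decreases` and `recurrent_on_sample` are hypothetical.

```python
# Each level mapping is written as a function of (|t1|_length, |t3|_length).
LEVELS = {
    1: lambda n1, n3: n1,
    2: lambda n1, n3: 3 * n1 + 1,
    3: lambda n1, n3: n3,
    4: lambda n1, n3: min(n1, n3),
}


def app2_decreases(level, x_len, z_len):
    """Compare head and body levels of the recursive append clause."""
    head = level(1 + x_len, 1 + z_len)  # append([U|X], Y, [U|Z])
    body = level(x_len, z_len)          # append(X, Y, Z)
    return head > body


def recurrent_on_sample(level, limit=20):
    """Check the strict decrease for all sampled list lengths below limit."""
    return all(app2_decreases(level, x, z)
               for x in range(limit) for z in range(limit))
```

A constant level mapping such as `lambda n1, n3: 0` fails the same check, as expected.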

Bezem formalised the concept of termination relating it to recurrency. 

Definition 4 (termination [4]). Let P be a logic program and G a goal. Then 
G is terminating wrt P iff every SLD-derivation for P U {G} is finite. P is 
terminating iff every variable-free goal is terminating wrt P. □ 

Theorem 1 (recurrency [4]). Every recurrent program is terminating. 

The same result was also obtained independently by Cavedon [11] in the more 
general context of recurrent programs with negation (called locally ω-hierarchical
programs in [11] and later renamed acyclic programs in [1]). The proof in [4] relies 
on the following definition. 

Definition 5 (bounded goal [4]). A goal $G = \leftarrow A_1, \ldots, A_n$ is bounded wrt a
level mapping |.| iff every $A_i$ is bounded wrt |.|. If G is bounded then $|[G]|$ denotes
the finite multiset of natural numbers $\{|\, |[A_1]|, \ldots, |[A_n]| \,|\}$. □

The proof of theorem 1 applies the abstraction function $A = |[.]|$ and as a result a
well-founded order $<$ is defined over the set of bounded goals by taking $G' < G$ iff
$|[G']| <_{mul} |[G]|$ where $<_{mul}$ is the multiset ordering over the natural numbers.
Recall that this ordering is defined by $s_1 <_{mul} s_2$ iff there exist $n_1, \ldots, n_m \in s_1$
and $n \in s_2$ such that $s_1 = (s_2 \setminus \{|n|\}) \cup \{|n_1, \ldots, n_m|\}$ and $n_i < n$ for all $i \in [1,m]$
[26]. The proof is completed by showing for every SLD-resolvent G' of a bounded
goal G, that G' is bounded and G' < G. In fact, this proof suggests a stronger
corollary (bounded goals are not necessarily variable-free, that is, ground).
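
A multiset comparison of this kind can be sketched in Python. The function below does not implement the single replacement step quoted above; it uses the standard Dershowitz-Manna characterisation instead, which is the transitive closure of that step: after cancelling common elements, every element left in the smaller multiset must be dominated by some element left in the larger one. The name `multiset_less` is hypothetical.

```python
from collections import Counter


def multiset_less(s1, s2):
    """True iff s1 is below s2 in the multiset ordering over the naturals."""
    m1, m2 = Counter(s1), Counter(s2)
    if m1 == m2:
        return False
    only1 = m1 - m2  # elements of s1 left after cancelling common ones
    only2 = m2 - m1  # elements of s2 left after cancelling common ones
    return all(any(y > x for y in only2) for x in only1)
```

Replacing the element 4 by any number of smaller elements, for example, yields a smaller multiset: `multiset_less([3, 3, 1], [4])` holds, whereas `multiset_less([4], [3, 3, 1])` does not.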

Corollary 1 (recurrency [4]). Let P be a logic program, G a goal and |.| a level
mapping. If P is recurrent wrt |.| and G is bounded wrt |.| then G is terminating
wrt P.

Example 6. Reconsider append and the level mappings of example 5. Then 

← append([U, V, W], Y, Z) is bounded wrt $|.|_1$, $|.|_2$ and $|.|_4$,

← append(X, Y, [U, V, W]) is bounded wrt $|.|_3$ and $|.|_4$.

Hence these goals are terminating wrt append. Also, for a goal G observe that
G is bounded wrt $|.|_1$ iff G is bounded wrt $|.|_2$. Moreover, G is bounded wrt
$|.|_4$ if (not iff) G is bounded wrt $|.|_1$ or G is bounded wrt $|.|_3$. Thus by proving
recurrency of append wrt $|.|_4$ a larger class of goals can be proven terminating
than by proving recurrency wrt $|.|_1$, $|.|_2$ or $|.|_3$. This illustrates that the choice of
the level mapping affects the set of goals which can be shown to be terminating.
□
As a final remark, Bezem also proved the converse of theorem 1.

Theorem 2 (recurrency [4]). A logic program is recurrent iff it is terminating. 

2.3 Acceptability 

The notion of recurrency is a theoretical one and is not of much use in proving 
termination of Prolog programs. Many Prolog programs only terminate under a 
left-to-right selection rule. This observation led Apt and Pedreschi [3] to refine 
the notion of termination as follows. 

Definition 6 (left-termination [3]). Let P be a logic program and G a goal. 
Then G is left-terminating wrt P iff every LD-derivation for P U {G} is finite. 
P is left-terminating iff every variable-free goal is left-terminating wrt P. □ 



Example 7. Consider the permute program below 
perm1  permute([], []).
perm2  permute([H|T], [A|P]) ← delete(A, [H|T], L), permute(L, P).
del1   delete(X, [X|Y], Y).
del2   delete(X, [Y|Z], [Y|W]) ← delete(X, Z, W).

The goal ← permute([1], [1]) is terminating wrt permute and as a consequence
is left-terminating also. The goal ← permute([1, 2], [1, 2]) is left-terminating
but not terminating, since there exists a computation rule which results in the
following infinite derivation

← permute([1, 2], [1, 2]),
← delete(1, [1, 2], L), permute(L, [2]),
← delete(1, [1, 2], L), delete(2, L, L'), permute(L', []),
← delete(1, [1, 2], L), delete(2, Z, W), permute(L', []),
← delete(1, [1, 2], L), delete(2, Z', W'), permute(L', []),



By theorem 2, it follows that the program is not recurrent. However, the program 
can be proven to be left-terminating. □ 

The class of recurrent programs was extended in [3] to the class of acceptable 
programs in order to provide a theoretical basis for proving termination of left- 
terminating programs. 

Definition 7 (acceptability [3]). Let |.| be a level mapping and I an interpretation
for a logic program P. A clause $c : H \leftarrow B_1, \ldots, B_n$ is acceptable wrt |.|
and I iff

1. I is a model for c and

2. for all $i \in [1,n]$ and for every grounding substitution $\theta$ for c such that
$I \models \{B_1, \ldots, B_{i-1}\}\theta$ it follows that $|H\theta| > |B_i\theta|$.

P is acceptable (wrt |.| and I) iff every clause in P is acceptable
(wrt |.| and I). □

Analogous results to those for recurrent programs (theorem 1, corollary 1 and 
theorem 2) have been proven for acceptable programs. The abstraction function 
used, however, is rather more complicated than that which is applied in the proof 
of recurrency. First, observe that if a goal $G = \leftarrow A_1, \ldots, A_n$ terminates under
a left-to-right computation rule, then each atom Ai is not necessarily bounded, 
but should be once the atoms to its left have been resolved. This idea forms the 
basis of the following definitions. 

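Definition 8 (maximum function). The maximum function $\max : \wp(\mathbb{N}) \to \mathbb{N} \cup
\{\infty\}$ is defined as

$$\max S = \begin{cases} 0 & \text{if } S = \emptyset \\ n & \text{else if } S \text{ is finite and } n \text{ is the maximum of } S \\ \infty & \text{otherwise} \end{cases}$$

Then $\max S < \infty$ iff the set S is finite. □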



Definition 9 (left-bounded goal [3]). Let |.| be a level mapping, I an interpretation
and $G = \leftarrow A_1, \ldots, A_n$ a goal. Then G is left-bounded wrt |.| and I iff the
set

$$|[G]_I^i| = \{\, |A_i\theta| \mid \theta \text{ is a grounding substitution for } G \text{ and } I \models \{A_1, \ldots, A_{i-1}\}\theta \,\}$$

is finite for each $i \in [1,n]$. If G is left-bounded wrt |.| and I then $|[G]_I|$ denotes
the finite multiset $\{|\, \max |[G]_I^1|, \ldots, \max |[G]_I^n| \,|\}$. □

Note that the term left-bounded is introduced here to avoid confusion with 
definition 5. 

Using the abstraction function $A = |[.]_I|$ allows one to prove that for a goal
G which is left-bounded wrt |.|, any SLD-resolvent G' of G is left-bounded and
furthermore $|[G']_I| <_{mul} |[G]_I|$. The result is the analogue of corollary 1.

Corollary 2 (acceptability [3]). Let P be a logic program, G a goal, |.| a level
mapping and I an interpretation for P. If P is acceptable wrt |.| and I and G is
left-bounded wrt |.| and I then G is left-terminating wrt P. □

Sufficient and necessary conditions for left-termination are characterised by 
the following theorem. 

Theorem 3 (acceptability). A logic program is acceptable iff it is left- 
terminating. 
Example 8. Considering the permute program again, let |.| be the level mapping
defined by

$$|permute(t_1, t_2)| = |t_1|_{length} + 1 \qquad |delete(t_1, t_2, t_3)| = |t_2|_{length}$$

and I be the interpretation

$$\{delete(t_1, t_2, t_3) \mid |t_2|_{length} = |t_3|_{length} + 1\} \cup \{permute(t_1, t_2) \mid |t_1|_{length} = |t_2|_{length}\}$$

Now I is a model for the program and, in particular, for the clause perm2, and
for every grounding substitution $\theta$ for perm2,

$$\begin{aligned} |permute([H|T], [A|P])\theta| &= |[H|T]\theta|_{length} + 1 \\ &> |[H|T]\theta|_{length} \\ &= |delete(A, [H|T], L)\theta| \end{aligned}$$

and for every grounding substitution $\theta$ for perm2 with $I \models delete(A, [H|T], L)\theta$,

$$\begin{aligned} |permute([H|T], [A|P])\theta| &= |[H|T]\theta|_{length} + 1 \\ &= (|L\theta|_{length} + 1) + 1 \\ &> |L\theta|_{length} + 1 \\ &= |permute(L, P)\theta| \end{aligned}$$

Hence perm2 is acceptable wrt |.| and I. The clauses perm1 and del1 are trivially
acceptable wrt |.| and I since I is a model for them, while the clause del2 can
easily be shown to be acceptable wrt |.| and I in the same way as for perm2.
This proves the program permute is left-terminating. □



3 The Recurrent Problem 

The main problem with recurrency, as noted by [3] and [12], is that it does not 
intuitively relate to recursion, the principal cause of non-termination in a logic 
program. Definition 3 requires that, for every ground instance of a clause, the 
level of its head atom is greater than the level of every body atom irrespective 
of the recursive relation between the two. There is a temptation to address this 
issue by using a modified definition of recurrency which only requires a decrease 
for mutually recursive body atoms. The following example, from [12], shows that 
this revision, by itself, is too weak to prove termination. 

Example 9. Using the weaker form of recurrency suggested above, the following 
program would be classed as recurrent. 

p([H|T]) ← append(X, Y, Y), p(T).

append([U|X], Y, [U|Z]) ← append(X, Y, Z).
append([], X, X).
Using the left-to-right computation rule and the top-down search rule, however,
the goal ← p([1, 2]) admits an infinite computation. Of course, the clause
defining the predicate p should not be classified as recurrent. The reason is that, 
while append is truly recurrent, only bounded goals are guaranteed to terminate 
and the predicate p contains an unbounded call to append. □ 

This example shows that the level mapping decrease between the head and 
the non-recursive atoms of a clause implied by definition 3, is required to ensure 
that all subcomputations are initiated from a bounded goal. Enforcing bound- 
edness in this way, however, complicates the derivation of level mappings. The 
following example, illustrating this, also comes from [12]. 

Example 10. Consider the following program 

p1  p([]).
p2  p([H|T]) ← q([H|T]), p(T).
q1  q([]).
q2  q([H|T]) ← q(T).

It is clear that this program is terminating for any goal ← p(x) where x is a rigid
list, that is, $|x|_{length} = |x\theta|_{length}$ for every grounding substitution $\theta$. To construct
an automatic proof of termination one would like to use the level mapping |.|
defined by

$$|p(x)| = |x|_{length} \qquad |q(x)| = |x|_{length}$$

The problem is that the clause p2 is not recurrent wrt this level mapping since
it is not the case that $|p([H|T])\theta| > |q([H|T])\theta|$ for all grounding substitutions
$\theta$. For the inequality to hold, an unnatural offset must be included in the level
mapping definition by taking for example $|p(x)| = |x|_{length} + 1$. □
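
The effect of the offset can be checked numerically. The Python sketch below samples values of the list length of T and tests the two inequalities that recurrency imposes on the clause p2, for a level mapping of p equal to the list length of its argument plus an offset and a level mapping of q equal to the list length of its argument. It is a finite check only, and `p2_is_recurrent` is a hypothetical name.

```python
def p2_is_recurrent(offset, limit=50):
    """Check clause p2 for every sampled length of T below limit."""
    for t_len in range(limit):
        head = (1 + t_len) + offset  # level of p([H|T])
        body_q = 1 + t_len           # level of q([H|T])
        body_p = t_len + offset      # level of p(T)
        if not (head > body_q and head > body_p):
            return False
    return True
```

With `offset=0` the check fails on the call to q, while `offset=1` passes on every sampled length.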

These examples show that the strict decrease in level between the head and body 
atoms of a recurrent clause is required for two distinct purposes: 

1. To ensure that the levels of mutually recursive calls are strictly decreasing. 

2. To ensure that subcomputations are initiated from a bounded goal. 

4 Semi-recurrency 

Apt and Pedreschi observed that, for termination, while it is necessary for the 
level mapping to decrease between the head of a clause and each mutually re- 
cursive body atom, a strict decrease is not required for the non-recursive body 
atoms. To distinguish between recursive and non-recursive body atoms the no- 
tion of predicate dependency is introduced. 
Definition 10 (predicate dependency). Let $p, q \in \Pi$ where $\Pi$ is the set of
predicate symbols in a logic program P. Then p directly depends on q iff

$$p(t_1, \ldots, t_{n_p}) \leftarrow B_1, \ldots, B_n \in P$$

and $B_i = q(s_1, \ldots, s_{n_q})$ for some $i \in [1,n]$. The depends on relation, denoted $\sqsupseteq$,
is defined as the reflexive, transitive closure of the directly depends on relation.
If $p \sqsupseteq q$ and $q \sqsupseteq p$ then p and q are mutually dependent and this is denoted by
$p \simeq q$. Furthermore, let $p \sqsupset q$ iff $p \sqsupseteq q$ and $p \not\simeq q$ and finally let $p \sqsubset q$ iff $q \sqsupset p$.
□
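
The relations of definition 10 can be computed by a simple fixpoint iteration. The Python sketch below assumes a hypothetical abstraction of a program as a dictionary mapping each predicate symbol to the set of predicate symbols that occur in the bodies of its clauses. The function names are illustrative only.

```python
def depends_on(program):
    """Reflexive, transitive closure of the 'directly depends on' relation."""
    preds = set(program) | {q for qs in program.values() for q in qs}
    closure = {p: {p} | set(program.get(p, ())) for p in preds}
    changed = True
    while changed:
        changed = False
        for p in preds:
            # Everything reachable from the predicates p already depends on.
            reachable = set().union(*(closure[q] for q in closure[p]))
            if not reachable <= closure[p]:
                closure[p] |= reachable
                changed = True
    return closure


def mutually_dependent(closure, p, q):
    """p and q depend on each other."""
    return q in closure[p] and p in closure[q]


def strictly_above(closure, p, q):
    """p depends on q but the two are not mutually dependent."""
    return q in closure[p] and not mutually_dependent(closure, p, q)
```

For a program in which p calls q and itself while q calls only itself, `depends_on({"p": {"p", "q"}, "q": {"q"}})` places p strictly above q.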



Apt and Pedreschi then introduced the notion of semi-recurrency to exploit
the observation that a strict decrease in level is only required for the mutually
recursive body atoms. In what follows rel(A) denotes the predicate symbol of
the atom A.

Definition 11 (semi-recurrency [3]). Let P be a logic program and |.| a level
mapping for P. A clause $H \leftarrow B_1, \ldots, B_n$ is semi-recurrent (wrt |.|) iff for every
grounding substitution $\theta$ and for all $i \in [1,n]$ it follows that

1. $|H\theta| > |B_i\theta|$ if $rel(H) \simeq rel(B_i)$,

2. $|H\theta| + 1 > |B_i\theta|$ if $rel(H) \sqsupset rel(B_i)$.

P is semi-recurrent (wrt |.|) iff every clause in P is semi-recurrent (wrt |.|). □

Whilst this definition now admits a simple termination proof of example 10 
using the level mapping of that example, it is not hard to construct examples 
where it is inadequate. 

Example 11. Consider the following program 

c1  p([]).
c2  p([H|T]) ← q([H, H|T]), p(T).
c3  q([]).
c4  q([H|T]) ← q(T).

To prove that the above program is semi-recurrent requires the following
unnatural level mapping: $|p(x)| = |x|_{length} + 1$ and $|q(x)| = |x|_{length}$. □

It seems that very little has actually been gained from this revised definition 
of recurrency which still insists that there is not an increase from the level of 
the head to the level of all body atoms. In fact, it does not matter if the level 
of a non-recursive atom is greater than the level of the head provided that such 
an atom is bounded whenever it is selected. 

To be fair, the notion of semi-recurrency was introduced to facilitate modular 
termination proofs and does indeed, in some cases, allow proofs to be based on 
simpler level mappings than those used in proofs of recurrency. In the above 
example, however, this is not the case. 
Example 12. Reconsider the program of example 11. According to the methodology
of [3] a modular termination proof can be constructed in a bottom-up
fashion on the recursive cliques of the predicate dependency graph. First q is
proven to be (semi) recurrent wrt $|.|_q$ defined by

$$|q(x)|_q = |x|_{length}$$

Second, p is proven to be (semi) recurrent wrt $|.|_p$ defined by

$$|p(x)|_p = |x|_{length} \qquad |q(x)|_p = 0$$

The final step in the construction requires the derivation of a level mapping |.|'
such that

$$|p([t_1|t_2])|' \geq |q([t_1, t_1|t_2])|_q \quad \text{and} \quad |p([t_1|t_2])|' > |p(t_2)|'$$

for all ground terms $t_1$ and $t_2$. Providing the level mapping |.|' exists, theorem
4.9 of [3] can be used to conclude that the program is semi-recurrent and hence
terminating. In terms of automation, this existence proof is achieved through
defining |.|' so that the above inequalities are satisfied. However, the most likely
choice of a definition for |.|' is

$$|p(x)|' = |x|_{length} + 1$$

which, of course, is no easier to derive than the original mapping |.| of
example 11. □

What is most conspicuous about the definition of semi-recurrency, is that the 
difference in levels between a non-recursive body atom and the head atom of a 
clause is limited to be at most zero, whereas it could be arbitrarily large, though 
still finite. Indeed, a simple termination proof for the program of example 11 can 
be obtained using a more natural level mapping if condition 2 of definition 11 
is replaced by $|H\theta| + k > |B_i\theta|$ if $rel(H) \sqsupset rel(B_i)$, where k is some large
constant. It is easy to prove that this revised definition of semi-recurrency is 
equivalent to recurrency. In addition, theorems 4.6, 4.8 and 4.9 of [3], which 
are used for constructing modular termination proofs, all still hold with this 
alternative definition. 

Note that the problem with the termination proofs above arises because 
the atoms in the body of a clause contain extra function symbols which raise 
the levels of those atoms to the level of the head. Since it is fairly unlikely 
that such a body atom will contain, say, a million function symbols or more, 
by taking k = 1000000 the vast majority of recurrent programs which occur
in practice could be proven terminating by focusing solely on their recursive
structure and employing the appropriately weakened forms of the theorems of 
Apt and Pedreschi. 

5 Semi-acceptability 

Similar remarks to those of section 3 can be made about the definition of accept- 
ability. The notion of semi-acceptability was introduced as an analogous concept 
to semi-recurrency for left-terminating programs. 
Definition 12 (semi-acceptability [3]). Let |.| be a level mapping and I an
interpretation for a logic program P. A clause $c : H \leftarrow B_1, \ldots, B_n$ is
semi-acceptable wrt |.| and I iff

1. I is a model for c and

2. for all $i \in [1,n]$ and for every grounding substitution $\theta$ for c such that
$I \models \{B_1, \ldots, B_{i-1}\}\theta$ it follows that

(a) $|H\theta| > |B_i\theta|$ if $rel(H) \simeq rel(B_i)$,

(b) $|H\theta| + 1 > |B_i\theta|$ if $rel(H) \sqsupset rel(B_i)$.

P is semi-acceptable (wrt |.| and I) iff every clause in P is semi-acceptable (wrt
|.| and I). □

Not surprisingly, termination proofs based on semi-acceptability suffer from 
similar problems to those encountered in examples 11 and 12. The definition 
could be adjusted in the manner prescribed above for semi-recurrency, but the 
result is not as satisfactory as the following example shows. 

Example 13. Consider the following program 

doubleSquare(0, []). 
doubleSquare(s(X), [D|Ds]) ←
    square(X, 0, Y), doublePlus(Y, 0, D), doubleSquare(X, Ds).

square(0, Y, Y).
square(s(X), Acc, Y) ←
    doublePlus(X, s(Acc), Acc1), square(X, Acc1, Y).

doublePlus(0, X, X).
doublePlus(s(X), Y, s(s(Z))) ←
    doublePlus(X, Y, Z).

The function symbol s is interpreted as the successor function. Let $|0|_s = 0$ and
$|s(x)|_s = 1 + |x|_s$ and I be the interpretation

$$\begin{aligned} &\{doublePlus(t_1, t_2, t_3) \mid |t_3|_s = 2|t_1|_s + |t_2|_s\} \;\cup \\ &\{square(t_1, t_2, t_3) \mid |t_3|_s = |t_1|_s^2 + |t_2|_s\} \;\cup \\ &\{doubleSquare(t, [t_1, t_2, \ldots, t_n]) \mid |t_i|_s = 2(|t|_s - i)^2\} \end{aligned}$$

Thus, for example, the goal ← doubleSquare(s(s(s(0))), L) will succeed with
L = [s(s(s(s(s(s(s(s(0)))))))), s(s(0)), 0]. Observe that I is a model for the program.
Now let the level mapping |.| be defined by

$$|doubleSquare(x, y)| = |square(x, y, z)| = |doublePlus(x, y, z)| = |x|_s$$

The predicates square and doublePlus are both semi-recurrent (and hence
semi-acceptable) wrt |.| and I. Now consider doubleSquare and in particular the
inequality $|doubleSquare(s(X), [D|Ds])\theta| + k > |doublePlus(Y, 0, D)\theta|$ where $\theta$ is any
grounding substitution. Since $I \models square(X, 0, Y)\theta$ it follows that $|Y\theta|_s = |X\theta|_s^2$,
hence there exists no k for which $1 + |X\theta|_s + k > |Y\theta|_s$ always holds. Therefore
the inequality $|doubleSquare(s(X), [D|Ds])\theta| + k = 1 + |X\theta|_s + k > |Y\theta|_s =
|doublePlus(Y, 0, D)\theta|$ cannot always hold. Hence the predicate doubleSquare is
not semi-acceptable wrt |.| and I under the revised definition suggested above
even though the level mapping is natural.

It is easy to prove semi-acceptability of the program, however, wrt the level
mapping |.|' where |.|' is defined exactly as for |.| except that

$$|doubleSquare(x, y)|' = |x|_s^2$$

Note that a goal is bounded wrt |.| iff it is bounded wrt |.|' and all such goals
are left-terminating. It seems reasonable then to base a proof of termination on
the former level mapping since it more closely relates to the recursion and as
a result is easier to derive automatically. However, no automatic termination
analysis has yet been devised which can manipulate quadratic level mappings
such as |.|'. □

Observe that the k above acts as an upper bound on the difference between the 
level of any body atom and the level of the head atom. Of course, this ad hoc 
approach falls down when there is no upper bound as in example 13. 
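
The absence of an upper bound in example 13 can be observed by running the program. The Python sketch below is a direct transliteration of the three predicates over natural numbers in place of successor terms. The function `first_violation` is a hypothetical helper that searches for the smallest value of X at which the inequality with slack k fails, where Y is the square of X.

```python
def double_plus(x, y):
    # doublePlus(X, Y, Z) holds when Z = 2X + Y.
    return 2 * x + y


def square(x, acc=0):
    # square(s(X), Acc, Y) <- doublePlus(X, s(Acc), Acc1), square(X, Acc1, Y).
    while x > 0:
        x -= 1
        acc = double_plus(x, acc + 1)
    return acc


def double_square(x):
    # doubleSquare(s(X), [D|Ds]) <- square(X, 0, Y), doublePlus(Y, 0, D), ...
    result = []
    while x > 0:
        x -= 1
        result.append(double_plus(square(x), 0))
    return result


def first_violation(k, limit=10_000):
    """Smallest X below limit with 1 + X + k <= Y, where Y = square(X)."""
    for x in range(limit):
        if not (1 + x + k > square(x)):
            return x
    return None
```

Whatever value of k is tried, the search eventually finds a violation, because the left-hand side grows linearly in X while the right-hand side grows quadratically.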

In summary, although semi-recurrency and semi-acceptability are more flexible
notions than their predecessors, they still enforce a dependence between the
level of a head atom and the levels of non-recursive body atoms. This dependence
is counterintuitive and forces one to use artificial level mappings to obtain
termination proofs.



6 Bounded-Recurrency 

Recall from section 3 that there are two conditions which must be fulfilled to 
ensure that a program is terminating. 

1. The levels of mutually recursive calls are strictly decreasing. 

2. All subcomputations are initiated from a bounded goal. 

It is possible to define what constitutes a terminating program directly from 
these two requirements. 

Definition 13 (bounded-recurrency). Let |.| be a level mapping for a logic program
P. A clause $c : H \leftarrow B_1, \ldots, B_n$ is bounded-recurrent (wrt |.|) iff for every
substitution $\theta$ for c such that $H\theta$ is bounded and for all $i \in [1,n]$ it follows that

1. $B_i\theta$ is bounded and

2. $|[H\theta]| > |[B_i\theta]|$ whenever $rel(H) \simeq rel(B_i)$.

P is bounded-recurrent (wrt |.|) iff every clause in P is bounded-recurrent (wrt
|.|). □
Observe that no decrease is enforced between the level of the head of a clause
and the levels of the non-recursive body atoms. All that is required is that each 
atom is bounded whenever the head is bounded. While this is more intuitively 
appealing, observe that boundedness of non-recursive atoms still influences the 
definition of the level mapping in a non-modular way. 

Example 14- Consider the following program for Curry’s type assignment taken 
from [3]. 

typ1  type(E, var(X), T) ← in(E, X, T).
typ2  type(E, apply(M, N), T) ← type(E, M, arrow(S, T)), type(E, N, S).
typ3  type(E, lambda(X, M), arrow(S, T)) ← type([(X, S) | E], M, T).

in1   in([(X, T) | E], X, T).
in2   in([(Y, S) | E], X, T) ← X ≠ Y, in(E, X, T).

One may observe that the predicate in is inductively defined over the length 
of its first argument which is a list. The predicate type is inductively defined on 
the size of its second argument which is a λ-term. As a result, one would hope
to base a termination proof on the level mapping |.| defined by

$$|in(x, y, z)| = |x|_{length} \qquad |type(x, y, z)| = |y|_{size}$$

The problem, of course, is that any call type(E, var(X), T) which is bounded wrt
|.| can give rise to a call in(E, X, T) which is not bounded wrt |.|. Clearly this can
lead to non-termination. Definition 13, therefore, insists that for the clause typ1
the body atom in(E, X, T) is bounded whenever the head is. Unfortunately this
entails that the level mapping must now be modified to take the first argument
of type into account. This in turn leads to problems with the clause typ3, since
the first argument is increasing in the recursive call. Eventually, one arrives at
a level mapping definition such as

$$|in(x, y, z)|_1 = |x|_{length} \qquad |type(x, y, z)|_1 = |x|_{length} + 2|y|_{size}$$

which bears no immediate relation to the program structure. As a result such a 
mapping is likely to be difficult to derive automatically. □ 

Clearly there is an interdependence between ensuring non-recursive atoms are 
bounded wrt | . | and ensuring that the levels of recursive calls are decreasing wrt 
|.|. This plainly arises out of the use of the one level mapping. It seems therefore 
that the obvious way to break the dependence is to use two level mappings. One 
holds the responsibility for ensuring the recursive decrease in levels, while the 
other assures that non-recursive atoms are bounded. This idea is captured in the 
following definition. 

Definition 14 (bounded-recurrency). Let $|.|_1$ and $|.|_2$ be level mappings for a
logic program P. A clause $c : H \leftarrow B_1, \ldots, B_n$ is bounded-recurrent (wrt $|.|_1$
and $|.|_2$) iff for every substitution $\theta$ for c such that $H\theta$ is bounded wrt $|.|_1$ and
$|.|_2$ and for all $i \in [1,n]$ it follows that




446 Jonathan C. Martin and Andy King 



1. $B_i\theta$ is bounded wrt $|.|_1$ and $|.|_2$ and

2. $|[H\theta]|_1 > |[B_i\theta]|_1$ whenever $rel(H) \simeq rel(B_i)$.

P is bounded-recurrent (wrt $|.|_1$ and $|.|_2$) iff every clause in P is
bounded-recurrent (wrt $|.|_1$ and $|.|_2$). □

It is informally understood that a goal G is bounded wrt $|.|_1$ and $|.|_2$ iff G
is bounded wrt $|.|_1$ and G is bounded wrt $|.|_2$. Note that, when the two level
mappings coincide, that is when $|.|_1 = |.|_2$, then definition 14 is equivalent to
definition 13.

Example 15. Returning to the program of example 14, recall that the stumbling 
block in the derivation of a natural level mapping arose because any call type(E, 
var(X), T) which is bounded wrt |.| can give rise to a call in(E, X, T) which is not 
bounded wrt |.|. At this point, one intuitively reasons that if the first argument 
of a call to type is a rigid list then the first argument of all subsequent calls to 
type will also be a rigid list. So define a second level mapping |.|' by 

$$|in(x, y, z)|' = |x|_{length} \qquad |type(x, y, z)|' = |x|_{length}$$

The program is bounded-recurrent wrt |.| and |.|'. Indeed, any call to type or
in which is bounded wrt |.| and |.|' only gives rise to calls which are bounded 
wrt |.| and |.|'. Combine this with the fact that recursive calls are decreasing 
wrt |.| and termination can be proven in a very intuitive manner. Furthermore, 
the level mappings |.| and |.|' follow directly from the structure of the program, 
facilitating their automatic derivation. □ 

Lemma 3 and corollary 3 below establish that bounded-recurrent programs 
are indeed terminating. Proof of this relies on orderings which not only take into 
account the levels of atoms but also their relation to each other in the predicate 
dependency graph. For a level mapping |.| and goal $G = \leftarrow A_1, \ldots, A_n$, if G is
bounded wrt |.| then let $|[G]|$ denote the finite multiset of pairs $\{|\, (rel(A_1), |[A_1]|),
\ldots, (rel(A_n), |[A_n]|) \,|\}$. Let $\prec$ be the lexicographical ordering on $\Pi(\sqsubset) \times \mathbb{N}(<)$
and let $\prec_{mul}$ be the multiset ordering induced from $\prec$. Observe that $\prec_{mul}$ is
well founded.

Lemma 3. Let $|.|_1$ and $|.|_2$ be level mappings for a logic program P. Let P be
bounded-recurrent wrt $|.|_1$ and $|.|_2$ and let G be a goal which is bounded wrt $|.|_1$
and $|.|_2$. Let G' be an SLD-resolvent of G from P. Then

1. G' is bounded wrt $|.|_1$ and $|.|_2$,

2. $|[G']|_1 \prec_{mul} |[G]|_1$, and

3. every SLD-derivation of $P \cup \{G\}$ is finite.

Proof. Assume $A_j$ is the selected literal in $G = \leftarrow A_1, \ldots, A_m$ and the used
clause is $c : H \leftarrow B_1, \ldots, B_n$ ($n \geq 0$). Then $G' = \leftarrow (A_1, \ldots, A_{j-1}, B_1, \ldots, B_n,
A_{j+1}, \ldots, A_m)\theta$ where $\theta \in mgu(A_j, H)$.
1. Since G is bounded wrt $|.|_1$ and $|.|_2$, it follows that $A_k$ and $A_k\theta$ are bounded
wrt $|.|_1$ and $|.|_2$ for all $k \in [1,m]$. In particular, $A_j\theta = H\theta$ is bounded wrt
$|.|_1$ and $|.|_2$. It follows, by definition 14, that $B_i\theta$ is bounded wrt $|.|_1$ and $|.|_2$
for all $i \in [1,n]$ and hence G' is bounded wrt $|.|_1$ and $|.|_2$.

2. Moreover, $|[A_k]|_1 \geq |[A_k\theta]|_1$ for all $k \in [1,m]$ by lemma 1. Finally, for all
$i \in [1,n]$

(a) $|[H\theta]|_1 > |[B_i\theta]|_1$ if $rel(A_j) = rel(H) \simeq rel(B_i)$, by definition 14, and

(b) $rel(B_i\theta) \sqsubset rel(A_j)$ otherwise.

Hence $(rel(B_i\theta), |[B_i\theta]|_1) \prec (rel(A_j), |[A_j]|_1)$ for all $i \in [1,n]$ and also
$(rel(A_k\theta), |[A_k\theta]|_1) \preceq (rel(A_k), |[A_k]|_1)$ for all $k \in [1,m]$ thereby proving
$|[G']|_1 \prec_{mul} |[G]|_1$.

3. Since $\prec_{mul}$ is well-founded the result follows immediately.

Corollary 3. Every bounded-recurrent program is terminating. 

Theorem 4. Let $|.|_1$ and $|.|_2$ be level mappings for a logic program P. The
following hold.

1. If P is recurrent wrt $|.|_1$ then P is bounded-recurrent wrt $|.|_1$ and $|.|_1$.

2. If P is bounded-recurrent wrt $|.|_1$ and $|.|_2$, then there exists a level mapping
$|.|_3$ such that P is recurrent wrt $|.|_3$. Moreover, for any atom A, A is bounded
wrt $|.|_3$ if A is bounded wrt $|.|_1$ and $|.|_2$.

Proof. Let $c : H \leftarrow B_1, \ldots, B_n$ be a clause in P. Suppose P is recurrent wrt $|.|_1$.
Let $\theta$ be a substitution such that $H\theta$ is bounded wrt $|.|_1$. Then $B_i\theta$ is bounded
and $|[H\theta]|_1 > |[B_i\theta]|_1$ for all $i \in [1,n]$ by recurrency. The second part follows by
lemma 3 and theorem 2.2 and corollary 2.2 of [5].

7 Bounded-Acceptability 

The definition of bounded-recurrency is easily adapted to obtain a characterisa- 
tion of left-terminating programs. 

Definition 15 (bounded-acceptability). Let $|.|_1$ and $|.|_2$ be level mappings and
I an interpretation for a logic program P. A clause $c : H \leftarrow B_1, \ldots, B_n$ is
bounded-acceptable (wrt $|.|_1$, $|.|_2$ and I) iff

1. I is a model for c and

2. for all $i \in [1,n]$ and for every substitution $\theta$ such that $H\theta$ is bounded wrt
$|.|_1$ and $|.|_2$, $\{B_1, \ldots, B_{i-1}\}\theta$ is ground and $I \models \{B_1, \ldots, B_{i-1}\}\theta$ it follows
that

(a) $B_i\theta$ is bounded wrt $|.|_1$ and $|.|_2$ and

(b) $|[H\theta]|_1 > |[B_i\theta]|_1$ whenever $rel(H) \simeq rel(B_i)$.

P is bounded-acceptable (wrt $|.|_1$, $|.|_2$ and I) iff every clause in P is
bounded-acceptable (wrt $|.|_1$, $|.|_2$ and I). □
Lemma 4 asserts that every bounded-acceptable program is left-terminating.
The proof of this follows along the same lines as that for acceptable programs.

Lemma 4. Let $|.|_1$ and $|.|_2$ be level mappings and I an interpretation for a
program P. Let P be bounded-acceptable wrt $|.|_1$, $|.|_2$ and I, and let G be a goal
which is left-bounded wrt $|.|_1$ and I and wrt $|.|_2$ and I. Let G' be an LD-resolvent
of G from P. Then

1. G' is left-bounded wrt $|.|_1$ and I and wrt $|.|_2$ and I,

2. $|[G']_I|_1 \prec_{mul} |[G]_I|_1$ and

3. every LD-derivation of $P \cup \{G\}$ is finite.

Proof. Let $G = \leftarrow A_0, A_1, \ldots, A_m$ ($m \geq 0$) and assume $c : H \leftarrow B_1, \ldots, B_n$
($n \geq 0$) is the program clause used. Then $G' = \leftarrow (B_1, \ldots, B_n, A_1, \ldots, A_m)\theta$
where $\theta \in mgu(A_0, H)$.

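1. It is necessary to show for all $j \in [1,2]$, $i \in [1, n+m]$ that $|[G']_I^i|_j$ is finite.
Firstly, for all $j \in [1,2]$, $i \in [1,n]$

$$\begin{aligned} |[G']_I^i|_j &= |[\leftarrow (B_1, \ldots, B_n, A_1, \ldots, A_m)\theta]_I^i|_j \\ &= \{\, |B_i\theta\phi|_j \mid \phi \text{ grounds } G',\ I \models \{B_1, \ldots, B_{i-1}\}\theta\phi \,\} \\ &\subseteq \{\, |B_i\theta\phi\sigma|_j \mid \phi \text{ grounds } \{B_1, \ldots, B_{i-1}\}\theta,\ I \models \{B_1, \ldots, B_{i-1}\}\theta\phi,\ \sigma \text{ grounds } B_i\theta\phi \,\} \end{aligned}$$

Now by definition 15, for all $i \in [1,n]$, for every substitution $\phi$ such that
$H\theta\phi$ is bounded wrt $|.|_1$ and $|.|_2$, $\{B_1, \ldots, B_{i-1}\}\theta\phi$ is ground and
$I \models \{B_1, \ldots, B_{i-1}\}\theta\phi$

(a) $B_i\theta\phi$ is bounded wrt $|.|_1$ and $|.|_2$ and

(b) $|[H\theta\phi]|_1 > |[B_i\theta\phi]|_1$ whenever $rel(H) \simeq rel(B_i)$.

Hence, $|[G']_I^i|_j$ is finite for all $i \in [1,n]$, $j \in [1,2]$. Now for all $j \in [1,2]$,
$k \in [1,m]$

$$\begin{aligned} |[G']_I^{n+k}|_j &= |[\leftarrow (B_1, \ldots, B_n, A_1, \ldots, A_m)\theta]_I^{n+k}|_j \\ &= \{\, |A_k\theta\phi|_j \mid \phi \text{ grounds } G',\ I \models \{B_1, \ldots, B_n, A_1, \ldots, A_{k-1}\}\theta\phi \,\} \\ &\subseteq \{\, |A_k\theta\phi|_j \mid \phi \text{ grounds } \{H, A_1, \ldots, A_m\}\theta,\ I \models \{H, A_1, \ldots, A_{k-1}\}\theta\phi \,\} \\ &= |[\leftarrow (A_0, A_1, \ldots, A_m)\theta]_I^{k+1}|_j \\ &\subseteq |[\leftarrow (A_0, A_1, \ldots, A_m)]_I^{k+1}|_j \end{aligned}$$

Since G is left-bounded wrt $|.|_1$ and I, and wrt $|.|_2$ and I, then $|[G']_I^{n+k}|_j$ is
finite for all $k \in [1,m]$, $j \in [1,2]$.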
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2. It follows directly that for all k G j G [1,2], max |[G']"'''^|j < max 

|[G]j'''^|j and for all i G [l,n], whenever rel(Ao) = rel{H) ~ rel{Bi) 

max|[G']}|i < max{|iL0(^|i | (j) grounds H9} 

= max{|^o6^'?^'|i I grounds Aq9} 

= maxjl^ ^o6']}|i 
< maxj[^ -4o]}|i 
= maxj[G]}|i 

Hence (re?(i?i61), max | [G']}|i) ^ (re?(Ho), max |[G]}|i) for all i G [l,n] and 
(rG(Hfe61),max|[G']”'''^|i) ^ (reZ(Hfc), max |[G]j''"^ |i) for all /c G [l,m] there- 
by proving |[G']/|i ^mui |[G]/|i. 

3. Since ^mui is well-founded the result follows immediately. 
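The role of the multiset ordering in step 3 can be illustrated with a small sketch. It is only an illustration under simplifying assumptions: the elements are plain natural numbers rather than the (relation, level) pairs compared in the proof, and the function name `multiset_greater` is hypothetical. It uses the standard characterisation of the multiset extension of a strict order: after cancelling common elements, every element remaining on the smaller side must be dominated by some element remaining on the larger side.

```python
from collections import Counter


def multiset_greater(m, n, greater=lambda a, b: a > b):
    """Return True if multiset m is strictly above multiset n in the
    multiset extension of the strict order `greater`."""
    m, n = Counter(m), Counter(n)
    if m == n:
        return False
    only_m = m - n  # elements of m left after cancelling common ones
    only_n = n - m  # elements of n left after cancelling common ones
    # Every leftover element of n must lie below some leftover element of m.
    return all(any(greater(x, y) for x in only_m) for y in only_n)


# Replacing one element by finitely many smaller ones is a decrease.
print(multiset_greater([3], [2, 2, 1]))
print(multiset_greater([2, 2], [3]))
```

In the setting of the lemma, an LD-resolution step replaces the level of the selected atom by the levels of finitely many body atoms, so a decrease of this kind at every step rules out infinite derivations.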

Corollary 4. Every bounded-acceptable program is left-terminating. 



Theorem 5. Let $|.|_1$ and $|.|_2$ be level mappings and $I$ an interpretation for a
program $P$. The following hold.

1. If $P$ is acceptable wrt $|.|_1$ then $P$ is bounded-acceptable wrt $|.|_1$ and $|.|_1$.

2. If $P$ is bounded-acceptable wrt $|.|_1$ and $|.|_2$, then there exists a level mapping
$|.|_3$ such that $P$ is acceptable wrt $|.|_3$. Moreover, for any atom $A$, $A$ is bounded
wrt $|.|_3$ if $A$ is bounded wrt $|.|_1$ and $|.|_2$. □

Proof. Let $c : H \leftarrow B_1, \ldots, B_n$ be a clause in $P$. Suppose $P$ is acceptable wrt
$|.|_1$. Then $I$ is a model for $c$. Let $\theta$ be a substitution such that $H\theta$ is bounded wrt
$|.|_1$, $\{B_1, \ldots, B_{i-1}\}\theta$ is ground and $I \models \{B_1, \ldots, B_{i-1}\}\theta$. Then $B_i\theta$ is bounded
wrt $|.|_1$ and $|[H\theta]|_1 > |[B_i\theta]|_1$ by acceptability and lemma 2. The second part
follows by lemma 4 and theorem 4.18 of [2]. □

Note that the proof of theorem 5, like that for theorem 4, does not need to
directly specify the relationship between $|.|_1$, $|.|_2$ and $|.|_3$ (though it would be
interesting to understand this connection).

8 Discussion 

The concept of bounded-acceptability proposed here is quite similar to that of
rigid-acceptability defined in [13,16]. This latter notion forms the basis of a
practical, demand-driven termination analysis. The analysis is essentially
top-down, attempting to prove termination for a set of queries $S$. An important step
in the analysis is the calculation of the call set $Call(P,S)$, the set of all calls
which may occur during the derivation of an atom in $S$. The analysis focuses on
the recursive components to derive a level mapping $|.|$, enforcing boundedness
of sub-computations by imposing a rigidity constraint on the call set. That is,
during the derivation of $|.|$, every atom in $Call(P,S)$ is required to be rigid
wrt $|.|$.

For program specialisation, and partial deduction in particular, it is more
useful to derive sufficient termination conditions for individual predicates rather 
than proving that a given top-level goal will terminate [10]. The reason is that the 
overall computation is unlikely to be left-terminating but some sub-computations 
probably will be. The required conditions can be derived in a bottom-up manner 
on the strongly connected components of the predicate dependency graph. The 
notion of bounded-acceptability lends itself naturally to this process. 
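As a rough illustration of the bottom-up traversal mentioned above, the following sketch computes the strongly connected components of a predicate dependency graph with Tarjan's algorithm, which happens to emit components callees-first. The graph and the predicate names `top`, `even`, `odd` and `leaf` are hypothetical and not taken from the paper; the paper does not prescribe a particular SCC algorithm.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps each predicate to the predicates
    called in its clause bodies. Components are returned callees-first,
    which is the order a bottom-up analysis would process them in."""
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components


# Hypothetical dependency graph: top calls even and leaf; even and odd
# are mutually recursive.
deps = {"top": ["even", "leaf"], "even": ["odd"], "odd": ["even"], "leaf": []}
print(strongly_connected_components(deps))
```

Termination conditions for the mutually recursive component would be derived first and then reused when the calling predicate is analysed.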

In [14], the analysis of [13] is adapted to obtain the above-mentioned
conditions. It attempts to derive for each predicate a maximal set $S$ of left-terminating
queries. Essentially, this amounts to deriving a level mapping $|.|$ which defines
$S$, in that an atom $A$ is in $S$ if and only if $A$ is bounded wrt $|.|$. However, an
important step is omitted from the paper, and the set $S$ may contain queries which
are not left-terminating. The level mapping $|.|$ is derived by only considering the
recursive components of the program and thus corresponds to the level mapping
$|.|_1$ in the definition of bounded-acceptability. Sub-computations are no longer
guaranteed to start from bounded goals since no rigidity constraint is placed on
the level mapping during its derivation as in [13]: specifically, the
set $Call(P,S)$ is unknown since $S$ is unknown (the idea after all being to derive
$S$), and as a result no rigidity constraint can be imposed on $Call(P,S)$. Hence,
in relation to the current work, the missing step is the derivation of the second
level mapping $|.|_2$. The maximal set $S' \subseteq S$ of left-terminating queries then
contains only those atoms which are bounded wrt $|.|_1$ and $|.|_2$. Note that $|.|_2$ can
be derived entirely independently of $|.|_1$, in the sense that there is never any need
to alter the definition of $|.|_1$ in order to obtain a definition of $|.|_2$ which can be
used to prove bounded-acceptability. Thus the notion of bounded-acceptability
allows the set $S'$ to be easily constructed from $S$ without requiring any change
to the method of [14].

Recently, Bossi et al. [6] have developed an entirely modular approach to
termination in which acceptability is proven on a module-by-module basis by
choosing a natural level mapping that focuses solely on the predicates defined
within the module. The key concept is strong boundedness. A query to a
program that is defined over $n$ modules $R_1, \ldots, R_n$ is said to be strongly bounded if
each call to a predicate that is defined in module $R_i$ is bounded with respect to
the level mapping $|.|_i$ for that module. A sufficient condition for left-termination
of a strongly bounded query is for each module $R_i$ to be acceptable with respect
to its own level mapping $|.|_i$ and a model of the whole program. Observe,
however, that strong boundedness is a property of derivations rather than a model.
The authors nevertheless argue that the approach is still attractive because strong
boundedness can be verified by approximating call-patterns by goal-dependent
abstract interpretation [8]. Moreover, well-moded [17] and well-typed logic
programs [7] are in some sense well-behaved with respect to strong boundedness
and thereby provide another route for asserting strong boundedness. This work
is applicable to general logic programs and therefore generalises the modular
termination proofs for well-moded definite programs originally proposed in [18]. By
way of contrast, the concept of bounded-acceptability proposed in this paper does
not rely on either call-pattern approximation or the program being well-moded or
well-typed.

In summary, the notions of bounded-recurrency and bounded-acceptability
enable a more mechanistic approach to be taken to the construction of level
mappings since these concepts finesse some of the complications that arise in the
construction of classic termination proofs. Level mappings that directly relate
to the recursive structure of the program, as well as being more intuitive for a
human, are bound to be easier to synthesise for a machine.



Acknowledgements This work was supported, in part, by the EPSRC
studentship 93315269; in fact much of this paper is adapted from chapters 3 and 6
of [22]. The work has benefited from useful discussions with Danny De Schreye and
Fred Mesnard. The authors gratefully acknowledge Nuffield grant SCI/180/94/417/G
and EPSRC grant GR/M08769 for funding their collaboration.

References 

1. K. R. Apt and M. Bezem. Acyclic Programs. New Generation Computing, 
9(3/4):335-364, 1991. 

2. K. R. Apt and D. Pedreschi. Reasoning about Termination of Pure Prolog Pro- 
grams. Information and Computation, 106(1):109-157, 1993. 

3. K. R. Apt and D. Pedreschi. Modular Termination Proofs for Logic and Pure
Prolog programs. In G. Levi, editor, Advances in Logic Programming Theory,
pages 183-229. Oxford University Press, 1994. Also available as technical report
CS-R9316 from Centrum voor Wiskunde en Informatica, CWI, Amsterdam.

4. M. Bezem. Characterizing Termination of Logic Programs with Level Mappings.
In E. L. Lusk and R. A. Overbeek, editors, North American Conference on Logic
Programming, pages 69-80. MIT Press, 1989.

5. M. Bezem. Strong Termination of Logic Programs. The Journal of Logic
Programming, 15(1&2):79-97, 1993.

6. A. Bossi, N. Cocco, S. Etalle, and S. Rossi. On Modular Termination Proofs of
General Logic Programs. Theory and Practice of Logic Programming, 2(3):263-291,
2002.

7. F. Bronsard, T. K. Lakshman, and U. S. Reddy. A Framework of Directionality for
Proving Termination of Logic Programs. In K. R. Apt, editor, Joint International
Conference and Symposium on Logic Programming, pages 321-335. MIT Press,
1992.

8. M. Bruynooghe. A Practical Framework for the Abstract Interpretation of Logic
Programs. The Journal of Logic Programming, 10(1/2/3&4):91-124, 1991.

9. M. Bruynooghe, M. Codish, S. Genaim, and W. Vanhoof. Reuse of Results in
Termination Analysis of Typed Logic Programs. In M. V. Hermenegildo and
G. Puebla, editors, Static Analysis Symposium, number 2477 in Lecture Notes in
Computer Science, pages 477-492. Springer-Verlag, 2002.

10. M. Bruynooghe, M. Leuschel, and K. F. Sagonas. A Polyvariant Binding-time
Analysis for Off-line Partial Deduction. In C. Hankin, editor, European Symposium
on Programming, volume 1381 of Lecture Notes in Computer Science, pages 27-41.
Springer-Verlag, 1998.

11. L. Cavedon. Continuity, consistency, and completeness properties of logic
programs. In G. Levi and M. Martelli, editors, International Conference on Logic
Programming, pages 571-584. MIT Press, 1989.







12. D. De Schreye, K. Verschaetse, and M. Bruynooghe. A Framework for Analysing
the Termination of Definite Logic Programs with Respect to Call Patterns. In
International Conference on Fifth Generation Computer Systems, pages 481-488.
IOS Press, 1992.

13. S. Decorte and D. De Schreye. Demand-driven and constraint-based automatic
left-termination analysis of logic programs. In L. Naish, editor, International
Conference on Logic Programming, pages 78-92. MIT Press, 1997.

14. S. Decorte and D. De Schreye. Termination analysis: Some practical properties
of the Norm and Level Mapping Space. In J. Jaffar, editor, Joint International
Conference and Symposium on Logic Programming, pages 235-249. MIT Press,
1998.

15. S. Decorte, D. De Schreye, and M. Fabris. Automatic Inference of Norms: A Missing
Link in Automatic Termination Analysis. In D. Miller, editor, International Logic
Programming Symposium, pages 420-436. MIT Press, 1993.

16. S. Decorte, D. De Schreye, and H. Vandecasteele. Constraint-based Termination
Analysis of Logic Programs. ACM Transactions on Programming Languages and
Systems, 21(6):1137-1195, 1999.

17. P. Dembiński and J. Małuszyński. And-Parallelism with Intelligent Backtracking
for Annotated Logic Programs. In Symposium on Logic Programming, pages 29-38.
IEEE Press, 1985.

18. S. Etalle, A. Bossi, and N. Cocco. Termination of Well-Moded Programs. The
Journal of Logic Programming, 38(2):243-257, 1999.

19. S. Genaim, M. Codish, J. P. Gallagher, and V. Lagoon. Combining Norms to Prove
Termination. In A. Cortesi, editor, Verification, Model Checking and Abstract
Interpretation, number 2294 in Lecture Notes in Computer Science, pages 126-138.
Springer-Verlag, 2002.

20. V. Lagoon, F. Mesnard, and P. Stuckey. Termination Analysis with Types is More
Accurate. In C. Palamidessi, editor, International Conference on Logic
Programming, Lecture Notes in Computer Science. Springer-Verlag, 2003.

21. J. Lloyd. Foundations of Logic Programming. Springer-Verlag, 1987.

22. J. C. Martin. Judgement Day: Terminating Logic Programs. PhD thesis,
Department of Electronics and Computer Science, University of Southampton, 2000.

23. J. C. Martin, A. King, and P. Soper. Typed Norms for Typed Logic Programs.
In J. Gallagher, editor, Logic Program Synthesis and Transformation (Selected
Papers), volume 1207 of Lecture Notes in Computer Science, pages 224-238.
Springer-Verlag, 1997.

24. J. C. Martin and M. Leuschel. Sonic Partial Deduction. In D. Bjørner, M. Broy, and
A. V. Zamulin, editors, Third International Andrei Ershov Memorial Conference,
volume 1755 of Lecture Notes in Computer Science, pages 101-112, 1999.

25. D. Pedreschi and S. Ruggieri. Verification of Logic Programs. The Journal of Logic
Programming, 39(1-3):125-176, 1999.

26. J. Van Leeuwen, editor, Handbook of Theoretical Computer Science: Volume B.
Elsevier, 1990.

27. W. Vanhoof and M. Bruynooghe. When Size Does Matter. In A. Pettorossi, editor,
Logic Based Program Synthesis and Transformation (Selected Papers), volume 2372
of Lecture Notes in Computer Science, pages 129-147, 2001.

28. S. Verbaeten, D. De Schreye, and K. F. Sagonas. Termination Proofs for Logic
Programs with Tabling. ACM Transactions on Computational Logic, 2(1):57-92,
2001.




Proving Termination for Logic Programs 
by the Query-Mapping Pairs Approach 



Naomi Lindenstrauss¹, Yehoshua Sagiv¹, and Alexander Serebrenik²



¹ School of Computer Science and Engineering, The Hebrew University
Jerusalem 91904, Israel
{naomil,sagiv}@cs.huji.ac.il
² Ecole Polytechnique (STIX), 91128 Palaiseau Cedex, France
Alexander.Serebrenik@stix.polytechnique.fr



Abstract. This paper describes a method for proving termination of 
queries to logic programs based on abstract interpretation. The method 
uses query-mapping pairs to abstract the relation between calls in the 
LD-tree associated with the program and query. Any well-founded partial
order for terms can be used to prove the termination. The ideas of the 
query-mapping pairs approach have been implemented in SICStus Prolog 
in a system called TermiLog, which is available on the web. Given a 
program and query pattern the system either answers that the query 
terminates or that there may be non-termination. The advantages of the 
method are its conceptual simplicity and the fact that it does not impose 
any restrictions on the programs. 



1 Introduction 

In this paper we describe a method for proving termination of queries to logic 
programs based on abstract interpretation. The results of applying the ideas of 
abstract interpretation to logic programs (cf. [24]) seem to be especially elegant 
and useful, because we are dealing in this case with a very simple language which 
has only one basic construct — the clause. Termination of programs is known to 
be undecidable, but in the case of logic programs, which have a clearly defined 
formal semantics and in which the only possible cause for non-termination is 
infinite recursion, it is possible to prove termination automatically for a large 
class of programs. 

Given a logic program, that is a finite set of clauses of the form

$A$ :- $B_1, B_2, \ldots, B_n$

where $A, B_1, B_2, \ldots, B_n$ are atoms, $n \geq 0$, and a query, which is an atom, we
want to find, if possible, substitutions for the variables of the query which make
it a logical consequence of the program (if there are no variables in the query
we just want to show that it is a logical consequence of the program). To do so
we usually use SLD-resolution to compute answers to the query (for all these
notions see [1,48]).

A crucial question in this case is whether the computation terminates. The
way the computation proceeds depends on the choice of the atom in the goal
on which resolution is performed at each step, and on the choice of the clause
with which it is resolved. Termination irrespective of which atom and clause
are chosen is called strong termination (cf. [12]) and means that all SLD-trees
constructed for the program and the query are finite. One can consider a weaker
notion of termination, where one can choose any clause, but the choice of atom
is determined by Prolog’s computation rule: always choose the leftmost atom
in a goal to resolve upon. This notion amounts to finiteness of the SLD-tree
constructed according to Prolog’s computation rule, which is called the LD-tree
(cf. [1]). A still weaker notion is ∃-termination, where one assumes that there is
at least one way to choose atoms so that for any way of choosing clauses there
is termination (cf. [65,57]). All notions of termination where one can choose
any clause fall in the category of universal termination, because one computes
all answers. The weakest notion of termination is existential termination, which
means that there either is finite failure or at least one way to choose atoms and
clauses, so that there is a successful derivation (cf. [80,8,49]).

All these kinds of termination are undecidable (this follows from the fact 
that the operation of a Turing machine can be described by a logic program, 
and the halting problem for Turing machines is undecidable). Nevertheless it 
is important to find sufficient conditions for termination that can be verified 
automatically. 

We consider the second notion of termination mentioned above, namely
termination of computing all answers using the leftmost computation rule of Prolog.
This notion of termination is also known as LD-termination (cf. [32]). Observe
that finding all answers in finite time is essential, even if one seems to be
interested in finding a single answer only. The query we solved may be backtracked
into and it is crucial that there will be termination also in that case (cf. the
section on ’Backwards Correctness’ in [56]). Moreover, Prolog’s built-in predicates
findall, setof and bagof depend on computing all answers to the query they include.

One of the difficulties when dealing with the LD-tree for a query, given a logic 
program, is that infinitely many different atoms may appear as subgoals. The 
basic idea is to abstract this possibly infinite structure to a finite one. The
query-mapping pairs method (cf. [66,44]) uses a certain kind of graph to abstract the
relation between arguments of calls in the LD-tree associated with the program 
and query. The method has been implemented in SICStus Prolog ([68]) in a 
termination analyzer called TermiLog (cf. [46]), which is available on the web 
([74]). Given a program and query pattern the analyzer either answers that the 
query terminates or that there may be non-termination. 

TermiLog was, as far as we know, the first publicly available automatic tool 
for proving termination of logic programs. It is based on one clear and simple 
idea — the notion of query-mapping pair — and the use of Ramsey’s theo- 
rem. This paper presents both the theoretical framework, which can be used for 
any well-founded order, and its application for linear norms, as implemented in 
TermiLog. The paper is completely self-contained. For instance, the instantiation
analysis, which is an extension of groundness analysis, and the constraint inference
are developed from the basics. The whole development of TermiLog was
done within the framework of standard Prolog. This and the inherent simplicity 
of the approach make the system very flexible and various optimizations can be 
easily added. 

The remainder of the paper is organized as follows. In Section 2 the key
notion of the approach, query-mapping pairs, is introduced. Query-mapping pairs
are defined relative to a partial mapping $\phi$ from terms to a strictly ordered
well-founded set. Any partial mapping $\phi$ from terms to a strictly ordered
well-founded set can be taken. In Section 3 the Basic Theorem is proved with the
help of Ramsey’s Theorem. From the Basic Theorem a sufficient condition for
termination, formulated in terms of query-mapping pairs, is derived. An example
of using the condition is given for a suitable mapping $\phi$. However, in order to
automate the process of proving termination, we cannot let the choice of $\phi$ be
determined by ingenuity, but need a uniform way of constructing it. This is
done by means of linear norms. Section 4, which comprises the main part of the
paper, explains the theoretical foundation for the use of query-mapping pairs
based on linear norms, as it is implemented in the TermiLog system ([74,46]).
First linear norms and the more general symbolic linear norms are defined. Then
the weighted rule graph that corresponds to a rule¹ of the program is defined.
This graph is a main tool in the construction of query-mapping pairs and in
the constraint inference (Subsection 4.7). Then it is explained how to obtain all
the query-mapping pairs relevant to a program and a query by the processes of
generation and composition. Next, it is explained how instantiation analysis and
constraint inference, which are used in the construction of the query-mapping
pairs, are performed, by a process of bottom-up abstract interpretation. We
then give some information about the implementation and the experimental
evaluation of the system. The paper ends with some information about related
work and the development of the query-mapping pairs approach.



2 LD-Trees and Query-Mapping Pairs 

To prove termination we use partial mappings from terms to strictly ordered
well-founded sets. A set $S$ is called strictly ordered if there is a binary relation
$>$ defined on it that is transitive (i.e. if $x, y, z$ are in $S$ then $x > y$, $y > z$ implies
$x > z$) and asymmetric (i.e. if $x, y$ are in $S$ then $x > y$ implies that $y > x$ cannot
hold). Note that this also means that it cannot be reflexive. A strictly ordered set
$(S, >)$ is called well-founded if there are no infinite sequences $x_1 > x_2 > x_3 > \ldots$
of elements of $S$. The way to prove termination will be to show that if there was
non-termination one could produce an infinite descending sequence of elements.
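The two conditions on a strict order can be checked mechanically on a finite carrier set. The sketch below is only an illustration with a hypothetical helper name, `is_strict_order`; it enumerates all pairs and triples. On a finite set any relation passing both checks is automatically well-founded, since an infinite descending sequence would have to repeat an element and transitivity would then give $x > x$.

```python
from itertools import product


def is_strict_order(elements, greater):
    """Check that `greater` is asymmetric and transitive on a finite set."""
    elements = list(elements)
    # Asymmetry; taking x == y also rules out reflexive pairs.
    asymmetric = all(
        not (greater(x, y) and greater(y, x))
        for x, y in product(elements, repeat=2)
    )
    transitive = all(
        greater(x, z)
        for x, y, z in product(elements, repeat=3)
        if greater(x, y) and greater(y, z)
    )
    return asymmetric and transitive


print(is_strict_order(range(5), lambda a, b: a > b))
print(is_strict_order(range(5), lambda a, b: a >= b))
```

The non-negative integers with the usual order satisfy both conditions and are well-founded even though they are infinite; the finite check above is merely a convenient way to test a candidate relation on samples.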

We will consider partial mappings $\phi$ from terms to a strictly ordered
well-founded set $(S, >)$. Most often this will be the set of non-negative integers with
the usual order, but any other strictly ordered well-founded set can be used. We
will require these mappings to be compatible with substitutions in the sense of
the following definition.

¹ A rule is a clause with non-empty body.

Definition 2.1 (Substitution-Compatible Mapping). A partial mapping $\phi$
defined for (not necessarily all) terms is called substitution-compatible if
whenever $\phi(T)$ is defined for a term $T$ and $\theta$ is a substitution, then $\phi(T\theta)$ is defined
too and is equal to $\phi(T)$. ²
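A minimal sketch of one substitution-compatible mapping follows, under assumptions that are not part of the paper: terms are encoded as nested tuples `(functor, arg, ...)`, variables as strings, and the helper names `apply_subst` and `phi` are hypothetical. Here `phi` returns the value of a numeral in successor notation and is undefined (`None`) for every other term. Because it is only defined on ground numerals, applying a substitution cannot change a value that is already defined.

```python
def apply_subst(term, subst):
    """Apply a substitution (dict from variable names to terms) to a term."""
    if isinstance(term, str):  # variables are encoded as strings
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(arg, subst) for arg in term[1:])


def phi(term):
    """Value of a numeral s(s(...0...)); None where the mapping is undefined."""
    n = 0
    while isinstance(term, tuple) and term[0] == "s" and len(term) == 2:
        n += 1
        term = term[1]
    return n if term == ("0",) else None


two = ("s", ("s", ("0",)))   # s(s(0))
open_term = ("s", "X")       # s(X): phi is undefined here
theta = {"X": ("0",)}

print(phi(two), phi(apply_subst(two, theta)))
print(phi(open_term), phi(apply_subst(open_term, theta)))
```

The second line of output shows the asymmetry allowed by the definition: a substitution may make the mapping defined on more terms, but never changes or removes a value that was already defined. A term-size function that counts a variable as zero would not qualify, because its value can grow when the variable is instantiated.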

We now proceed to define a relation between nodes in the LD-tree that ’call’ 
each other. 

Definition 2.2. Let $P$ be a program and $Q$ be a query. Let $\leftarrow P_1, \ldots, P_m$
and $\leftarrow Q_1, \ldots, Q_n$ be nodes in the LD-tree of $P$ and $Q$. We say that node
$\leftarrow Q_1, \ldots, Q_n$ is a direct offspring of node $\leftarrow P_1, \ldots, P_m$ if $P_1$ has been resolved
with a clause $c$ in $P$ and $Q_1$ is, up to a substitution, a body subgoal of $c$.

Suppose we assign to each node in the LD-tree a unique natural number (it
does not matter how, as long as there is a one-to-one correspondence between
nodes and numbers), and suppose that if node $(k)$ is resolved with the clause
instance $A \leftarrow B_1, \ldots, B_n$ we add a subscript $(k)$ to all the predicates of the
atoms $B_j$, $1 \leq j \leq n$. Then, if the subscript of the predicate of the first atom of
node $(l)$ is $(m)$ this means that node $(l)$ is a direct offspring of node $(m)$.

Definition 2.3 (Offspring Relation and Call Branch). We define the
offspring relation as the non-reflexive transitive closure of the direct offspring relation. We
call a path between two nodes in the tree such that one is the offspring of the
other a call branch. ³

Take for example the program for computing Ackermann’s function with the
goal ack(s(0),s(s(0)),A):

Example 2.1.

(i) ack(0,N,s(N)).

(ii) ack(s(M),0,A) :- ack(M,s(0),A).

(iii) ack(s(M),s(N),A) :- ack(s(M),N,A1), ack(M,A1,A).

□ 

The LD-tree is given in Figure 1. Note that the predicate of each atom in
the LD-tree has a subscript that reports who is its ’parent’, that is the node in
the LD-tree that caused this atom to be called as the result of resolution. Node
(2) and node (6) are, for instance, direct offspring of node (1), because the first
atoms in their respective goals come from the body of clause (iii), with which
the goal of node (1) was resolved. Node (3), for instance, is an offspring of node
(1), but not a direct offspring.

² The reader may notice a similarity between this notion and the notion of a rigid norm
(cf. for example [25]). The difference is that a norm is defined for all terms. Here the
mapping is partial and a crucial part of the requirements for it to be
substitution-compatible is that it be defined for $T\theta$, for any substitution $\theta$, if it is defined for
$T$.

³ There is affinity between our offspring relation and the descendant relation in [69],
however the relation here is defined for nodes in the LD-tree, while the relation there
is defined for atoms.



(1) ← ack(s(0), s(s(0)), A)

(2) ← ack_(1)(s(0), s(0), A1), ack_(1)(0, A1, A)

(3) ← ack_(2)(s(0), 0, A2), ack_(2)(0, A2, A1), ack_(1)(0, A1, A)

(4) ← ack_(3)(0, s(0), A2), ack_(2)(0, A2, A1), ack_(1)(0, A1, A)

{A2 ↦ s(s(0))}

(5) ← ack_(2)(0, s(s(0)), A1), ack_(1)(0, A1, A)

{A1 ↦ s(s(s(0)))}

(6) ← ack_(1)(0, s(s(s(0))), A)

{A ↦ s(s(s(s(0))))}

(7) ←

Fig. 1. LD-tree of ack 

A graphical representation of the direct offspring relation is given in Figure 2. 




Fig. 2. The offspring relation of ack 
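The offspring relation can be recomputed from the direct offspring relation by following parent links upwards. The sketch below is an illustration only: the dictionary `ack_parent` is read off the subscripts of the first atoms in the LD-tree of Figure 1, and the function name `offspring_relation` is hypothetical.

```python
def offspring_relation(parent):
    """Given a map node -> node it is a direct offspring of, return the
    set of pairs (node, ancestor) in the offspring relation, i.e. the
    transitive closure of the direct offspring relation."""
    pairs = set()
    for node in parent:
        ancestor = parent[node]
        while ancestor is not None:
            pairs.add((node, ancestor))
            ancestor = parent.get(ancestor)
    return pairs


# Subscript of the first atom of each node in the LD-tree of ack.
# Node (7) is the empty goal and has no first atom.
ack_parent = {2: 1, 3: 2, 4: 3, 5: 2, 6: 1}

relation = offspring_relation(ack_parent)
print((3, 1) in relation)   # node (3) is an offspring of node (1)
print((3, 1) in set(ack_parent.items()))  # but not a direct offspring
```

Each pair in the result corresponds to a call branch between the two nodes.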



In this example, to which we will return later, there is only one branch in the 
LD-tree. To give an example where the tree consists of more than one branch, 
take the program pqrst: 

Example 2.2. 

p(X) :- q(X), r(X).

p(X) :- s(X).

s(X) :- t(X).

q(a). r(a). t(b).

□

In this case the LD-tree is given in Figure 3 and a representation of the
offspring relation is given in Figure 4.



                      (1) ← p(X)
                     /          \
(2) ← q_(1)(X), r_(1)(X)        (5) ← s_(1)(X)
          |                           |
       {X ↦ a}                  (6) ← t_(5)(X)
          |                           |
(3) ← r_(1)(a)                     {X ↦ b}
          |                           |
(4) ←                           (7) ←



Fig. 3. LD-tree of pqrst 






Fig. 4. The offspring relation of pqrst 



In order to give information about call branches in the LD-tree we will use 
argument mappings, or mappings in short. 

Definition 2.4 (Argument Mapping). An argument mapping is a mixed 
graph, that is a graph with both arcs and edges, whose nodes correspond to argu- 
ment positions of some atoms (possibly just one atom). A node corresponding to 
an argument position of an atom is labeled by the predicate of the atom and by 
the argument number. Nodes are either black or white. Nodes connected by an 
edge must be of the same color. Nodes connected by an arc must be black. 

The intuition behind this definition is the following: as we said in the
beginning of this section we will use partial mappings $\phi$ from terms to a strictly
ordered well-founded set. Nodes of the argument mapping correspond to
arguments of atoms. The color of a node, black or white, will be used to depict
whether $\phi$ is defined or, respectively, not known to be defined for the argument.
An arc going from one node to another depicts the fact that the value of $\phi$ for
the argument corresponding to the first node is greater than the value for the
argument corresponding to the second. Nodes connected by an arc must therefore
be black, because $\phi$ is defined for the arguments they represent. An edge can
either connect two black nodes for which $\phi$ is defined and the values are equal,
or two nodes for which the corresponding arguments are identical (and then $\phi$
is defined or undefined for both). Suppose we have the rule

p(X,Y) :- p(a,Y), q(b,X).

and $\phi(a) = 1$, $\phi(b) = 2$.







Then we can depict the relations between the arguments of the atoms appearing 
in the rule by 




In order not to clutter the picture we will usually omit some of the labels and 
imply them by the layout of the graph: 

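p   o   o

p   •   o

q   •   o

(edges: first argument of the head p to second argument of q, and second argument
of the head p to second argument of the body p; arc: first argument of q to first
argument of the body p)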
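The argument mapping for this rule can also be built mechanically from the definition. The sketch below is an assumption-laden illustration, not TermiLog's implementation: arguments are plain strings, node labels such as `("head p", 1)` and the function name `build_mapping` are hypothetical, and $\phi$ is given as a lookup that returns `None` where it is undefined.

```python
def build_mapping(atoms, phi):
    """atoms: list of (label, argument list). Returns (colors, edges, arcs):
    colors maps each node to True (black) or False (white), edges are
    unordered node pairs, arcs are ordered pairs (greater, smaller)."""
    nodes = [((label, i + 1), arg)
             for label, args in atoms
             for i, arg in enumerate(args)]
    colors = {node: phi(arg) is not None for node, arg in nodes}
    edges, arcs = set(), set()
    for n1, a1 in nodes:
        for n2, a2 in nodes:
            if n1 == n2:
                continue
            v1, v2 = phi(a1), phi(a2)
            if a1 == a2 or (v1 is not None and v1 == v2):
                edges.add(frozenset((n1, n2)))  # identical or equal value
            elif v1 is not None and v2 is not None and v1 > v2:
                arcs.add((n1, n2))              # value strictly decreases
    return colors, edges, arcs


phi = {"a": 1, "b": 2}.get   # undefined (None) on the variables X and Y
rule = [("head p", ["X", "Y"]), ("body p", ["a", "Y"]), ("body q", ["b", "X"])]
colors, edges, arcs = build_mapping(rule, phi)
print(sorted(node for node, black in colors.items() if black))
print(arcs)
```

By construction arcs only join black nodes and edges only join nodes of the same color, as the definition of an argument mapping requires.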

One mapping may be more ’general’ than another. This brings us to the 
definition of subsumption. 



Definition 2.5 (subsumption). Given two mappings $G_1$ and $G_2$, we say that
$G_1$ subsumes $G_2$ if they have the same nodes up to color, every node that is
black in $G_1$ is also black in $G_2$, and every edge or arc between nodes in $G_1$ also
appears for the respective nodes in $G_2$.
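The subsumption test is a direct transcription of the definition. In the sketch below a mapping is represented, purely as an assumption for illustration, by a triple of a color dictionary (node to `True` for black), a set of edges and a set of arcs; the node names `p1` and `p2` are hypothetical.

```python
def subsumes(g1, g2):
    """True if mapping g1 subsumes mapping g2."""
    colors1, edges1, arcs1 = g1
    colors2, edges2, arcs2 = g2
    return (
        colors1.keys() == colors2.keys()              # same nodes up to color
        and all(colors2[n] for n, black in colors1.items() if black)
        and edges1 <= edges2                          # every edge of g1 is in g2
        and arcs1 <= arcs2                            # every arc of g1 is in g2
    )


# Mapping for an atom such as p(X, 0) and for its instance p(0, 0).
general = ({"p1": False, "p2": True}, set(), set())
instance = ({"p1": True, "p2": True}, {frozenset(("p1", "p2"))}, set())

print(subsumes(general, instance))
print(subsumes(instance, general))
```

The more general mapping carries less information, so it subsumes the more specific one but not the other way round.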



The intuition behind this definition is that we will assume that $\phi$ is
substitution-compatible, so when we apply a substitution to an atom, the arguments
for which $\phi$ was defined will continue to be defined and have the same value,
while it may happen that $\phi$ will be defined for more arguments. If we have a
mapping, and apply a substitution to the arguments it represents, the resulting
mapping will be subsumed by the original mapping. To give an example:




subsumes 




We now define basic query-mapping pairs. Basic query-mapping pairs give 
information about the size relations between the arguments of the first atoms of 
two nodes such that one is the direct offspring of the other. 



Definition 2.6 (Basic Query-Mapping Pairs). Let $\phi$ be a partial mapping
from terms to a strictly ordered well-founded set $(S, >)$ that is
substitution-compatible. Let a logic program and a query be given. Suppose $\leftarrow P_1, \ldots, P_m$
and $\leftarrow Q_1, \ldots, Q_n$ are two nodes in the LD-tree such that the second is a direct
offspring of the first. Let $\theta$ be the composition of the substitutions along the path
between these two nodes.

A basic query-mapping pair corresponding to these two nodes consists of two
parts:

— The query pattern, that is a mapping whose nodes correspond to the
argument positions of $P_1 = p(t_1, \ldots, t_k)$ and are either black, if we know that $\phi$ is
defined for the corresponding argument, or white, if we don’t know whether $\phi$
is defined for the corresponding argument. If we know that the value of $\phi$ is
equal for the $i$’th and $j$’th arguments or that the arguments are identical, the
graph will include an edge from the $i$’th to the $j$’th position. If we know that
$\phi(t_i) > \phi(t_j)$, the graph will include an arc from the $i$’th to the $j$’th position.

— A mapping, whose nodes correspond to the argument positions of $P_1\theta$ and
the argument positions of $Q_1$. Again nodes can be black or white and there
can be edges and arcs between them, with the meaning as above. We call the
nodes corresponding to the argument positions of $P_1\theta$ with the edges and arcs
between them the domain of the mapping, and the nodes corresponding to the
argument positions of $Q_1$ with the edges and arcs between them the range of
the mapping.

Since $\phi$ is assumed to be substitution-compatible, the query pattern of a
query-mapping pair subsumes the domain of its mapping. Note that the information
given by the query-mapping pair does not have to be complete; the important
thing is that it be sound and hopefully sufficient for the proof of termination.
The query describes abstractly the atom $P_1$. The mapping describes the relation
between the atoms $P_1\theta$ and $Q_1$. By the time we arrive at $Q_1$ certain substitutions
will have been applied to the variables of $P_1$ and we want to take them into
account in order to have $\phi$, which we will use to prove the termination, be
defined on as many nodes as possible.

We use the following notation for the query-mapping pairs: for the query
pattern we first give an atom with the predicate of $P_1$ and arguments b or f,
depending on whether the corresponding nodes are black or white, and then a
list of the edges and arcs in the form eq(i,j) for an edge between the $i$’th and
$j$’th argument position and gt(i,j) for an arc from the $i$’th to the $j$’th argument
position; for the part of the mapping we use a pictorial representation.

If we take node (1) and node (2) of the LD-tree for Ackermann’s function
and assume that φ gives for each number written in successor notation its
corresponding (nonnegative) value, we get the basic query-mapping pair

(i)  query: ack(b,b,f) [gt(2,1)]
     mapping: [figure: argument mapping between the nodes of ack (domain) and ack (range)]


If we take node (2) and its direct offspring node (3), we get the following 
basic query-mapping pair 

(ii) query: ack(b,b,f) [eq(1,2)]
     mapping: [figure: argument mapping between the nodes of ack (domain) and ack (range)]

Looking at each pair of nodes in the LD-tree that are direct offspring of each
other is not practical, because the tree may be infinite. In our case both node
(2) and node (3) originate from resolution with clause (iii) and their first atoms
originate in the first atom of the body of the clause. So we may just look at the
relation between the head and first body atom in

ack(s(M),s(N),A) :- ack(s(M),N,A1), ack(M,A1,A).

Then we get the basic query-mapping pair 



(iii) query: ack(b,b,f) []
      mapping: [figure: argument mapping between the nodes of ack (domain) and ack (range)]


Note that this pair subsumes the previous two. 

Basic query-mapping pairs express relations between first atoms of nodes 
that are direct offspring of each other. To express relations between first atoms 
of nodes that are offspring but not direct offspring of each other we compose 
query-mapping pairs. 

We have seen that for some mappings we have defined domain and range. 
For such mappings we define composition. 

Definition 2.7 (Composition of Mappings). Let μ and ν be mappings with
domain and range. If the range of μ and the domain of ν are labeled by the
same predicate, then the composition of the mappings μ and ν, denoted μ ∘ ν,
is obtained by unifying each node in the range of μ with the corresponding node
in the domain of ν (two nodes are corresponding if they correspond to the same
argument position). When unifying two nodes, the result is a black node if at
least one of the nodes is black, otherwise it is a white node. If a node becomes
black, so do all nodes connected to it with an edge. The nodes of the domain of
μ ∘ ν are the nodes of the domain of μ, and the nodes of the range of μ ∘ ν are
the nodes of the range of ν. The edges and arcs of μ ∘ ν consist of the transitive
closure of the union of the edges and arcs of μ and ν.

The following figure shows two mappings (left) and their composition (right). 




Definition 2.8 (Consistency). A mapping is consistent if it has no positive
cycle (a positive cycle is a cycle consisting of edges and at least one arc where
all arcs are in the same direction).

Note that a mapping that expresses relations in an LD-tree must be consistent.
Indeed, assume that this is not the case, that is, there exists an inconsistent
mapping expressing relations in the LD-tree. A positive cycle may only have black
nodes. Because of transitivity we get that for each argument t corresponding to
a node on the positive cycle we must have φ(t) < φ(t), in contradiction to the
irreflexivity of the order <.

Definition 2.9 (Summary). If μ is a consistent mapping with domain and
range, then the summary of μ consists of the nodes in the domain and range of
μ and the edges and arcs between these nodes. The summary is undefined if μ is
inconsistent.

Definition 2.10 (Composition of Query-Mapping Pairs). Let (π_1, μ_1) and
(π_2, μ_2) be query-mapping pairs, such that the range of μ_1 is identical to π_2. The
composition of (π_1, μ_1) and (π_2, μ_2) is (π_1, μ), where μ is the summary of μ_1 ∘ μ_2
(and, hence, the composition is undefined if μ_1 ∘ μ_2 is inconsistent).
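
Definitions 2.7 to 2.10 can be illustrated by a small Python sketch. Everything about the representation is an assumption made for illustration and is not taken from the paper or from TermiLog: a mapping is a pair (black, rel), nodes are tuples such as ("dom", 1) or ("rng", 2), and rel maps an ordered pair of nodes to EQ (an edge) or GT (an arc). The function names close and compose are hypothetical.

```python
from itertools import product

EQ, GT = 1, 2  # EQ: edge (equal values), GT: arc (strictly greater value)

def close(nodes, rel):
    """Transitive closure of edges and arcs; rel maps (u, v) to EQ or GT."""
    rel = dict(rel)
    for (u, v), s in list(rel.items()):
        if s == EQ:
            rel.setdefault((v, u), EQ)  # an edge can be traversed both ways
    changed = True
    while changed:  # naive fixpoint, adequate for a handful of nodes
        changed = False
        for u, k, v in product(nodes, repeat=3):
            if (u, k) in rel and (k, v) in rel:
                s = max(rel[(u, k)], rel[(k, v)])  # one arc makes the path an arc
                if rel.get((u, v), 0) < s:
                    rel[(u, v)] = s
                    changed = True
    return rel

def compose(mu, nu):
    """Summary of mu o nu as (black, rel), or None if it is inconsistent."""
    def rename(node, side):
        # the range of mu and the domain of nu are unified into "mid" nodes
        return ("mid", node[1]) if node[0] == side else node
    black = {rename(n, "rng") for n in mu[0]} | {rename(n, "dom") for n in nu[0]}
    rel = {}
    for mapping, side in ((mu, "rng"), (nu, "dom")):
        for (u, v), s in mapping[1].items():
            key = (rename(u, side), rename(v, side))
            rel[key] = max(rel.get(key, 0), s)
    nodes = black | {n for pair in rel for n in pair}
    rel = close(nodes, rel)
    if any(rel.get((n, n)) == GT for n in nodes):
        return None  # positive cycle, so the composition is undefined
    # a black node blackens every node connected to it by an edge
    black |= {v for (u, v), s in rel.items() if s == EQ and u in black}
    keep = lambda n: n[0] in ("dom", "rng")
    return ({n for n in black if keep(n)},
            {(u, v): s for (u, v), s in rel.items()
             if keep(u) and keep(v) and u != v})
```

In this sketch an edge is stored in both directions after closure, so two summaries should be compared only after both have passed through compose or close.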

By composing the query-mapping pairs (i) and (ii) above
we get the query-mapping pair

query: ack(b,b,f) [gt(2,1)]
mapping: [figure: argument mapping between the nodes of ack (domain) and ack (range)]

By composing the pair (iii) above with itself we get again the pair (iii),
that is, it is idempotent.

Definition 2.11. A query-mapping pair that can be composed with itself, that
is, a pair in which the query is identical to the range of the mapping, is called
circular.

The pair (iii) is circular and idempotent. The following pair is circular but not
idempotent:

query: p(b,f) []
mapping: [figure: argument mapping between the nodes of p (domain) and p (range)]

The result of composing it with itself is

query: p(b,f) []
mapping: [figure: argument mapping between the nodes of p (domain) and p (range)]

3 The Basic Theorems

The following theorem holds:

Theorem 3.1. If there is an infinite branch in the LD-tree corresponding to a
program and query then there is an infinite sequence of nodes N_1, N_2, ... such
that for each i, N_{i+1} is an offspring of N_i.

Proof. Straightforward (cf. [81]). □

Suppose we have some way to associate basic query-mapping pairs with call
branches between nodes that are direct offspring of each other. Suppose that
this is done in such a way that if N_2 is a direct offspring of N_1 and N_3 is a direct
offspring of N_2 then the range of the mapping of the pair for N_1, N_2 is equal
to the query of the pair for N_2, N_3. Then we can use composition to associate
a query-mapping pair with each call branch. Note that composition of
query-mapping pairs is associative. Note also that given a program there can only be
a finite number of query-mapping pairs associated with it.

The finiteness of the set of query-mapping pairs allows us to use the following
version of Ramsey’s theorem to prove our basic theorem (cf. [29]). We will use
the following notation: if M is a subset of the natural numbers ℕ, we will denote
by M^[n] the set of all subsets of M of cardinality n.

Theorem 3.2 (Ramsey). Let a be a mapping from ℕ^[n] to some finite set A.
Then there is an infinite subset M of ℕ such that a is constant on M^[n].

(Cf. [62,39]. For a short self-contained proof of this version of the theorem see
[10], p. 290.)

Theorem 3.3 (Basic Theorem). Suppose the LD-tree for a program and query
has an infinite branch. Suppose a substitution-compatible partial mapping φ from
terms to a strictly ordered well-founded set is given. Suppose that basic
query-mapping pairs are assigned to nodes that are direct offspring of each other in
such a way that if N_2 is a direct offspring of N_1 and N_3 is a direct offspring of
N_2 then the range of the mapping of the pair for N_1, N_2 is equal to the query of
the pair for N_2, N_3. In this case a query-mapping pair can be assigned to each
call path by composing the basic pairs along the path. Then there is a sequence
of nodes M_1, M_2, ... and a query-mapping pair (π, μ) so that for each i, M_{i+1}
is an offspring of M_i, and for each j < k the query-mapping pair corresponding to
the call branch from M_j to M_k is (π, μ). The pair (π, μ) can be composed with
itself and is idempotent.

Proof. By Theorem 3.1 there is an infinite sequence of nodes N_1, N_2, ... such
that for each i, N_{i+1} is an offspring of N_i. The set of query-mapping pairs that
can be constructed with the predicate symbols of a program is finite. For each
i < j we get one query-mapping pair in this set. By using Ramsey’s theorem
for n = 2 we get that there is an infinite subsequence N_{i_1}, N_{i_2}, ... and a unique
query-mapping pair (π, μ) so that for each k < l the pair assigned to the call
branch from N_{i_k} to N_{i_l} is (π, μ). Clearly this pair can be composed with itself
and is idempotent. To see that it is idempotent, take for example the three nodes
N_{i_1}, N_{i_2}, N_{i_3}. The pair (π, μ) corresponds to each of the three call paths from
N_{i_1} to N_{i_2}, from N_{i_2} to N_{i_3} and from N_{i_1} to N_{i_3}. Since the pair corresponding
to the path from N_{i_1} to N_{i_3} is the composition of the pairs corresponding to the
paths from N_{i_1} to N_{i_2} and from N_{i_2} to N_{i_3} we get the idempotence. □

From this theorem we get the following sufficient condition for termination 
which is formulated in terms of query-mapping pairs. 

Theorem 3.4 (Sufficient Condition for Termination). Suppose the LD-tree
for a program and query and a substitution-compatible partial mapping φ
from terms to a strictly ordered well-founded set S are given. Consider a complete
set of query-mapping pairs associated with the tree and the mapping φ (that is,
pairs for all nodes that are direct offspring of each other and their compositions).
If for every circular idempotent pair there is an arc from an argument in the
domain to the corresponding argument in the range then the tree must be finite,
i.e. there is termination for the query with Prolog’s computation rule.

Proof. From the previous theorem we get that if there is an infinite branch
there must be an infinite sequence of nodes N_1, N_2, ... such that the same
circular idempotent pair corresponds to each pair of nodes N_i, N_{i+1}. Since every
circular idempotent pair contains an arc from an argument in the domain to
the corresponding argument in the range, this would imply the existence of an
infinite descending sequence of φ values in contradiction to the assumption that
S is well-founded. □
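
The condition of Theorem 3.4 amounts to a simple check once the circular idempotent pairs are known. The following Python sketch assumes, purely for illustration, that each such pair is given as a triple (name, arity, arcs), where arcs is a set of ordered node pairs and nodes are written ("dom", i) and ("rng", i). The function names are hypothetical and the sketch does not compute the pairs themselves.

```python
def has_decrease(arcs, arity):
    """True if some argument i has an arc from domain position i to range position i."""
    return any((("dom", i), ("rng", i)) in arcs for i in range(1, arity + 1))

def failing_pairs(circular_idempotent_pairs):
    """Names of the pairs violating the condition; an empty list means the test succeeds."""
    return [name for name, arity, arcs in circular_idempotent_pairs
            if not has_decrease(arcs, arity)]
```

An empty result corresponds to a successful termination proof; a non-empty result only means that this sufficient condition does not apply, not that the query fails to terminate.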

It turns out that we do not have to construct all query-mapping pairs. Ob- 
viously it is enough to consider only pairs that may participate in the creation 
of a circular pair. 

Definition 3.1 (Predicate Dependency Graph). The predicate dependency
graph of a program is a graph whose nodes are the predicates of the program and
which has, for every rule A :- B_1, ..., B_n in the program (remember that a
rule is a clause with non-empty body) and every i, 1 ≤ i ≤ n, an arc from the
predicate of A to the predicate of B_i (cf. [58]).

We can consider the strongly connected components of this graph. We call a 
strongly connected component trivial if it consists of a single node that has no 
arc going from itself to itself. It is easy to see that if the predicate dependency 
graph has no non-trivial strongly connected component, there can be no recur- 
sion in the program. Also the only pairs that can participate in the creation of a 
circular pair are those for which the predicate of the domain and the predicate 
of the range are in the same non-trivial strongly connected component. 

Definition 3.2 (Recursive Query-Mapping Pair). A query-mapping pair
is called recursive if the predicates of the domain and range belong to the same
strongly connected component of the predicate dependency graph.

Theorem 3.5 (Optimization of Sufficient Condition for Termination).
Theorem 3.4 remains true if we only consider the recursive query-mapping pairs.

If we use Theorem 3.5 instead of Theorem 3.4 there will in general be fewer 
pairs to consider so we will get an answer more quickly. 
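
As an illustration of Definitions 3.1 and 3.2, the following Python sketch builds a predicate dependency graph and tests whether two predicates lie in the same non-trivial strongly connected component. The input format (a list of (head predicate, body predicates) pairs) and the function names are assumptions of this sketch. It uses plain reachability rather than a dedicated SCC algorithm, which is enough for small programs.

```python
def dependency_graph(rules):
    """rules: iterable of (head_predicate, [body_predicates]); facts are omitted."""
    graph = {}
    for head, body in rules:
        graph.setdefault(head, set()).update(body)
        for b in body:
            graph.setdefault(b, set())
    return graph

def reachable(graph, start):
    """Predicates reachable from start by a path of at least one arc."""
    seen, stack = set(), list(graph[start])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(graph[p])
    return seen

def is_recursive(graph, p, q):
    """True if p and q are in the same non-trivial strongly connected component."""
    return q in reachable(graph, p) and p in reachable(graph, q)
```

For a program whose only rules define gt in terms of arc and gt, this gives is_recursive(g, "gt", "gt") but not is_recursive(g, "gt", "arc"), so only pairs from gt to gt need to be considered.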

Example 3.1. Suppose a finite directed acyclic graph is given by a set of facts
of the form arc(X, Y) denoting an arc going from X to Y. We can use this
graph to define a relation X > Y if there is a path of arcs from X to Y. This
relation is clearly transitive. Because of the acyclicity of the graph, it is also
asymmetric. Hence, the relation we have defined is an order. Moreover it is
well-founded because of the finiteness and acyclicity of the graph. Consider the
program consisting of all the facts of the form arc(_, _) and the clauses

gt(X,Y) :- arc(X,Y).
gt(X,Y) :- arc(X,Z), gt(Z,Y).

and the query pattern gt(f,f).

If we resolve the goal ← gt(X, Y) with the second rule we get

← arc(X, Z), gt(Z, Y).

After one more step of LD-resolution we get ← gt(z_0, Y), where {X ↦ x_0, Z ↦ z_0}
is a substitution that unifies arc(X, Z) with one of the facts of the program.
To construct query-mapping pairs we can take as φ the identity mapping on
the nodes of the acyclic graph with the order defined above. We get the
query-mapping pair





(1)  query: gt(f,f) []
     mapping: [figure: argument mapping between the nodes of gt (domain) and gt (range)]


(Note, by the way, that in this case the query pattern and the domain are
different.) Now we have a new query pattern, gt(b,f), for which we get the pair

(2)  query: gt(b,f) []
     mapping: [figure: argument mapping between the nodes of gt (domain) and gt (range)]


These are the only recursive query-mapping pairs in this case (if we were
considering all query-mapping pairs, not only the recursive ones, we would also
have pairs with predicate gt in the domain and predicate arc in the range). The
only circular pair is the second one, and it has an arc from the first argument in
the domain to the first argument in the range, so we get that there is termination
for the query pattern gt(f,f). Note that this result is not obvious, since for
graphs that are not acyclic we could get non-termination of queries matching
the query pattern. □

4 Query-Mapping Pairs Based on Linear Norms 

This section explains the theoretical foundations for the implementation of the 
TermiLog system [74,46]. Since our aim is automatic termination analysis, we 
need a uniform method to create query-mapping pairs associated with a query 
and program in order to test the sufficient condition of Theorems 3.4 and 3.5. 
What we need is a uniform way of ordering terms. In our case, the order on 
terms is defined by means of linear norms. 



4.1 Linear Norms and Symbolic Linear Norms 

For each ground term we define a norm, which is a non-negative integer; note 
that different terms may have the same norm. 

Definition 4.1 (Linear Norms). A linear norm of a ground term
f(T_1, ..., T_n) is defined recursively as follows

\[ \|f(T_1, \ldots, T_n)\| = c + \sum_{i=1}^{n} a_i \|T_i\| \]

where c and a_1, ..., a_n are non-negative integers that depend only on f/n. Note
that the definition also applies, as a special case, to constants (which are
zero-arity function symbols).

Linear norms generalize earlier norms used in automatic termination analysis.
In particular, the list size of [76] and the term size of [78] are special cases of
linear norms. Plümer used in his work on termination two restricted cases of
linear norms. One corresponds to the case where all the a_i are equal to 1 (in [58],
it is called a “linear norm”) and the other corresponds to the case where each
a_i is chosen to be either 0 or 1 (in [59], it is called a “semi-linear norm”).

Since the terms that appear in logic programs are very often nonground, we
extend the definition of a linear norm to nonground terms by denoting the norm
of a variable X by X itself. Thus, a nonground term has a Symbolic Linear
Norm, which is a linear expression with non-negative coefficients. For example,
if for a function symbol f of arity 3, the a_i and c are all equal to 1, then the
symbolic norm of the term f(X, Y, X) is 2X + Y + 1. We will say that a linear
expression is in normalized form if each variable appears once in it and also the
free coefficient appears at most once. So 2X + 5Y + 2 is in normalized form,
while X + X + 3Y + 2Y + 1 + 1 is not.
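
The computation of a symbolic linear norm in normalized form can be sketched in Python as follows. The term representation is an assumption of this sketch: a logic variable is a string such as "X", and a compound term or constant is a tuple (functor, arg_1, ..., arg_n). The parameter spec is a hypothetical function returning (c, [a_1, ..., a_n]) for a functor and arity.

```python
def symbolic_norm(term, spec):
    """Return the normalized norm as (free_coefficient, {variable: coefficient})."""
    if isinstance(term, str):          # a variable denotes its own norm
        return 0, {term: 1}
    functor, args = term[0], term[1:]
    c, a = spec(functor, len(args))
    const, coeffs = c, {}
    for a_i, arg in zip(a, args):
        k, vs = symbolic_norm(arg, spec)
        const += a_i * k
        for v, n in vs.items():        # collect each variable once
            coeffs[v] = coeffs.get(v, 0) + a_i * n
    return const, {v: n for v, n in coeffs.items() if n != 0}

def show(norm):
    """Render a normalized norm, e.g. (1, {"X": 2, "Y": 1}) as '2X + Y + 1'."""
    const, coeffs = norm
    parts = [v if n == 1 else f"{n}{v}" for v, n in sorted(coeffs.items())]
    if const or not parts:
        parts.append(str(const))
    return " + ".join(parts)

all_ones = lambda functor, arity: (1, [1] * arity)   # c = 1 and every a_i = 1
```

With these definitions, show(symbolic_norm(("f", "X", "Y", "X"), all_ones)) yields "2X + Y + 1", the example given in the text.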

The idea of associating an integer variable with a logic variable goes back
to [78]. In [71] what we call symbolic norm for the case of term-size is called
structural term size. Some authors define the norm of a variable to be 0 and
then use the norm only for terms that are rigid with respect to it (cf. [59], [25]).
In our context it is more convenient to use the symbolic norm. If the symbolic
norm of a term is an integer then we know that the term is rigid: its norm is a
constant and cannot change by different instantiations of its variables.

Definition 4.2 (Instantiated Enough). A (possibly nonground) term is in- 
stantiated enough with respect to some linear norm if its symbolic norm is an 
integer. 

Instantiated-enough terms are essentially rigid terms in the terminology 
of [59,25]; that is, terms that cannot change their sizes due to further unifi- 
cations. 

For terms t that are instantiated enough we will take φ(t) = ||t||. Thus terms
that are instantiated enough will be mapped into the well-founded set of the
non-negative integers. In this context black nodes will correspond to arguments
that are instantiated enough.

As an example for a linear norm, consider the list-size norm defined for list
terms as

\[ \|[H|T]\| = 1 + \|T\| \]

that is, c = 1, a_1 = 0 and a_2 = 1, while for all other functors the norm is 0. In
this case, the norm is a positive integer exactly for lists that have a finite positive
length, regardless of whether the elements of those lists are ground or not. Thus,
all finite lists are instantiated enough with respect to the list-size norm.

Another example is the term-size norm. It is defined for a functor f of
arity n by setting each a_i to 1 and c to n. According to the term-size norm, a
term is instantiated enough only if it is ground.
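
The difference between the two norms can be seen in a short Python sketch. The representation is an assumption of this sketch: variables are strings, other terms are tuples (functor, args...), and a list cell [H|T] is written (".", H, T) with ("[]",) for the empty list. The function names are hypothetical.

```python
def norm(term, spec):
    """Symbolic norm as (constant, {variable: coefficient})."""
    if isinstance(term, str):
        return 0, {term: 1}
    c, a = spec(term[0], len(term) - 1)
    total, coeffs = c, {}
    for a_i, arg in zip(a, term[1:]):
        k, vs = norm(arg, spec)
        total += a_i * k
        for v, n in vs.items():
            if a_i * n:                      # skip variables that do not count
                coeffs[v] = coeffs.get(v, 0) + a_i * n
    return total, coeffs

def list_size(functor, arity):
    # c = 1, a_1 = 0, a_2 = 1 for list cells; norm 0 for every other functor
    return (1, [0, 1]) if (functor, arity) == (".", 2) else (0, [0] * arity)

def term_size(functor, arity):
    # c = n and every a_i = 1
    return arity, [1] * arity

def instantiated_enough(term, spec):
    """A term is instantiated enough if its symbolic norm is an integer."""
    return not norm(term, spec)[1]
```

For the list [X, Y] with unbound elements, the list-size norm is the integer 2, so the term is instantiated enough, whereas its term-size norm still contains X and Y.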

In our experimentation we found that in most cases the term-size norm is 
sufficient for showing termination. In some cases the list-size norm is needed. 
There were only few cases in which a general linear norm was needed. In the 
version of TermiLog on the web [74] it is only possible to use the term-size 
or list-size norms, while in the full version there also is a possibility for the 
user to define an appropriate general norm. Note also that the set of queries
described by a query pattern depends on the norm, so if we prove termination
of append(b,b,f) with the term-size norm this means that there is termination
for queries in which the first two arguments are ground, while termination with
the list-size norm means that there is termination for queries whose first two
arguments are lists of finite length that may contain variables.

4.2 The Weighted Rule Graph 

Our first step is to construct from each rule of the program a graph, which
extracts all the information about argument norms that is in the rule. This graph
will be used in the construction of the query-mapping pairs. The graph will have
nodes labeled by the terms that are the arguments of the atoms of the rule. For
each term we will compute its symbolic linear norm, which is a linear expression
in the variables of the term. As mentioned earlier we use the name of a logic
variable to denote its norm. If nodes N1 and N2 are labeled by terms T1 and T2
and ||T1|| = ||T2|| we will put in the graph an edge between the nodes. Otherwise
we will compute the difference ||T1|| - ||T2||. This difference is a linear expression.
If this expression, when put into normalized form, has non-negative coefficients
and a positive free coefficient (by "free coefficient" we mean the coefficient that
does not precede a variable, say a_0 in a_0 + a_1 X + a_2 Y) this means that whenever
the numeric variables get non-negative norm values (when the respective
logic variables become instantiated enough through appropriate substitutions)
the expression will be a positive number. In this case we will put in the graph
a potential arc, labeled by the normalized norm difference, from N1 to N2. We
will draw potential arcs as dashed arcs.

It should be explained what potential arcs are. In the termination proof we 
use the fact that the order induced by the norm on terms that are instantiated 
enough is well-founded (recall that for such terms the norm is a non-negative 
integer). Once we know that the nodes connected by a potential arc are instan- 
tiated enough, we connect them with an arc. However, we will not do this when 
we do not know that the arguments are instantiated enough, because we want 
to be sure that there cannot be an infinite path consisting of arcs. Consider for 
example the program 

int(0).
int(s(X)) :- int(X).

with the query int(Y) and the term-size norm. From the rule we get the weighted
rule graph

[figure: weighted rule graph with head node int: s(X), body node int: X, and a potential (dashed) arc labeled 1 from s(X) to X]

However, there is an infinite derivation

← int(Y)
     {Y ↦ s(Y1)}
← int(Y1)
     {Y1 ↦ s(Y2)}
← int(Y2)
     ...


We now come to the formal definition: 

Definition 4.3. The weighted rule graph associated with a rule has as nodes all
the argument positions of the atoms in the rule, each labeled by the term filling
that argument position. Let N1 and N2 be any nodes in the graph, labeled by
the terms T1 and T2, respectively. If ||T1|| = ||T2|| then the graph will contain
an edge between N1 and N2. If the normalized form of ||T1|| - ||T2|| has
non-negative coefficients and a positive free coefficient then the graph will contain a
potential arc from N1 to N2 labeled by ||T1|| - ||T2||.
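
Definition 4.3 can be sketched in Python under the assumption that the symbolic norm of every argument has already been computed and is given as (free_coefficient, {variable: coefficient}). Node labels such as ("H", 1) for the first head argument are hypothetical names chosen for this sketch.

```python
def difference(n1, n2):
    """Normalized form of n1 - n2 for norms given as (constant, coefficients)."""
    (c1, v1), (c2, v2) = n1, n2
    coeffs = {x: v1.get(x, 0) - v2.get(x, 0) for x in set(v1) | set(v2)}
    return c1 - c2, {x: k for x, k in coeffs.items() if k != 0}

def weighted_rule_graph(norms):
    """norms: dict node -> symbolic norm. Returns (edges, potential_arcs)."""
    edges, arcs = set(), {}
    for n1 in norms:
        for n2 in norms:
            if n1 == n2:
                continue
            c, coeffs = difference(norms[n1], norms[n2])
            if c == 0 and not coeffs:
                edges.add(frozenset((n1, n2)))        # equal norms
            elif c > 0 and all(k >= 0 for k in coeffs.values()):
                arcs[(n1, n2)] = (c, coeffs)          # potential arc with its weight
    return edges, arcs
```

For the rule int(s(X)) :- int(X) with the term-size norm, the norms 1 + X and X give a single potential arc of weight 1 from the head argument to the body argument.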

For example, using the term-size norm, we get for the rule

ack(s(M),s(N),A) :- ack(s(M),N,A1), ack(M,A1,A).

the weighted rule graph that is shown in the figure. (Note that in order not to
clutter the figure, we do not usually show edges and arcs that could be deduced
from other edges and arcs.)

[figure: weighted rule graph with one row of nodes per atom of the rule: ack: s(M), s(N), A (head); ack: s(M), N, A1 (first body atom); ack: M, A1, A (second body atom)]


4.3 Generation of Basic Query-Mapping Pairs 

To generate basic query-mapping pairs we’ll use 

— Results of the instantiation analysis (see Subsection 4.6). 

— Results of the constraint inference (see Subsection 4.7). 

— Weighted rule graphs for the rules of the program. 

The instantiation analysis and constraint inference use abstract interpreta- 
tion to give information about atoms that follow from the program. 

Instantiation analysis tells us which instantiations we can expect in atoms 
that are logical consequences of a program. For instance for the Ackermann 
program with the term-size norm we get the instantiation patterns 

ack(ie,ie,ie)    ack(ie,nie,nie)

where ie denotes an argument that is instantiated enough with respect to the 
norm and nie denotes an argument that is not instantiated enough. The result 
of the instantiation analysis is a set of instantiation patterns — atoms with a 
predicate that is a predicate of the program and arguments that are either ie or 
nie. Each atom that follows logically from the program is described by one of 
these instantiation patterns. 

Constraint inference tells us which constraints we can expect in atoms that 
are logical consequences of a program. For instance for the Ackermann function 
we get the constraints 

constraint(ack/3, [gt(3,1), gt(3,2)])    constraint(ack/3, [gt(1,2), gt(3,2)])
constraint(ack/3, [gt(3,2)])

(The notation gt(i,j) means that the i’th argument is greater than the j’th
argument.) The constraint inference gives us for each predicate of the program
such a disjunction of conjunctive constraints. A conjunctive constraint has the
form

constraint(pred/ar, list-of-argument-constraints)

where pred is a predicate of the program, ar is its arity, and the
list-of-argument-constraints contains basic constraints of the form gt(i,j) or eq(i,j). A
disjunctive constraint for a predicate is given by a list of conjunctive constraints
for the predicate. Each atom that follows logically from the program satisfies the
basic constraints of one of the conjuncts of the disjunctive constraint describing
its predicate. Since any atom satisfies the empty list of basic constraints,
performing the constraint inference is optional: if we can prove termination
without it, it usually is faster. However, there are cases in which it is impossible
to prove termination without it. For each example in the tables at the end of
the paper one can see whether constraint inference was used in the termination
proof by observing if there is a time given in the "Constr" column. One can
see that there are more examples for which it was not used. However in a case
like quicksort we need constraint inference for partition to deduce that in the
clause

quicksort([X|Xs],Ys) :- partition(Xs,X,Littles,Bigs),
                        quicksort(Littles,Ls),
                        quicksort(Bigs,Bs),
                        append(Ls,[X|Bs],Ys).

the norms of the local variables Littles and Bigs are smaller than the norm of [X|Xs].
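
The constraint format described above can be mirrored in a few lines of Python. This is only a sketch of what it means for an atom to satisfy a disjunctive constraint: the tuple encoding ("gt", i, j) and the function names are assumptions, and the sketch does not perform the constraint inference itself.

```python
def satisfies(norms, conjunct):
    """norms: list of argument norms; conjunct: list of (op, i, j), 1-based positions."""
    ops = {"gt": lambda x, y: x > y, "eq": lambda x, y: x == y}
    return all(ops[op](norms[i - 1], norms[j - 1]) for op, i, j in conjunct)

def satisfies_disjunction(norms, disjunction):
    """True if the norms satisfy at least one conjunctive constraint."""
    return any(satisfies(norms, conjunct) for conjunct in disjunction)

# the three conjunctive constraints listed for ack/3 in the text
ACK = [[("gt", 3, 1), ("gt", 3, 2)],
       [("gt", 1, 2), ("gt", 3, 2)],
       [("gt", 3, 2)]]
```

For instance, the argument norms [1, 2, 4] satisfy the first conjunct, while the empty conjunct [] is satisfied by any atom, which is why constraint inference is optional.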

In argument mappings we only have edges and arcs and black and white 
nodes. In the weighted rule graph there is more information — there are weighted 
arcs and there are labels for the nodes. We want to deduce argument mappings 
from weighted graphs augmented with information about instantiations and con- 
straints. To do so we need the following. 

Definition 4.4 (Zero-Weight and Positive-Weight Paths). When traversing
a path, an edge can be traversed in both directions and its weight is zero. An
arc can be traversed only in the direction of the arrow. A weighted arc with a
label w can be traversed in both directions; in the direction of the arrow its weight
is w and in the opposite direction its weight is -w. A path has a positive weight
if either

— there is at least one arc between adjacent nodes and the normalized expression 
for the sum of the weights has non-negative coefficients 

or 

— there is no arc between adjacent nodes but the normalized expression for the 
sum of the weights has non-negative coefficients and a positive free coeffi- 
cient. 

A path has zero weight if it only has edges and weighted arcs (but no arcs) and
the sum of the weights along the path is zero.

Definition 4.5 (Inferred Edges and Arcs). Let G be an augmented weighted
rule graph. We infer a new edge between nodes u and v if there is a zero-weight
path between these nodes. We infer a new potential arc from node u to node v
if there is a positive-weight path from u to v. This arc is a real arc if the terms
labeling u and v are instantiated enough.

If we know from the query pattern or the instantiation analysis that a certain 
node of the weighted rule graph is black, i.e. its label is instantiated enough, we 
can compute the linear norm of the label and deduce that all variables appearing 
in the expression for the norm are ground. By propagating this information to 
other labels it may be possible to deduce that they are also instantiated enough. 
We call this process Inference of Black Nodes. For example, suppose we have 
the rule 

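p(f(X),g(Y)) :- q(h(X,Y)).

and suppose we use the term-size norm. In this case the norms of the terms
f(X), g(Y), h(X,Y) are, respectively, 1 + X, 1 + Y, 2 + X + Y. The weighted rule
graph is:

[figure: weighted rule graph with head nodes p: f(X), g(Y) and body node q: h(X,Y); potential arcs lead from h(X,Y) to f(X), labeled 1 + Y, and from h(X,Y) to g(Y), labeled 1 + X]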

Suppose a query p(b, b) is given. Then we know that X and Y must be ground 
and can deduce that the argument of q must be ground too, i.e. the corresponding 
node is black. 
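
The inference of black nodes can be sketched in Python as follows. The sketch assumes that for every node of the rule graph we already know the set of variables occurring in its symbolic norm; the node names and the function name infer_black are hypothetical.

```python
def infer_black(norm_vars, black):
    """norm_vars: dict node -> set of variables in the node's symbolic norm.

    Every variable occurring in the norm of a black node is known to be fixed;
    a node whose norm mentions only such variables is black as well.
    """
    known = set().union(*(norm_vars[n] for n in black))
    return {n for n, variables in norm_vars.items() if variables <= known}
```

For the rule p(f(X),g(Y)) :- q(h(X,Y)) and the query pattern p(b,b), both X and Y become known, so the node of q is inferred to be black; with only the first argument of p black it stays white.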

Suppose a query pattern Q is given. Take a rule r: H :- S_1, ..., S_n such
that the predicate of H is identical to the predicate of Q. We will describe how to
generate a query pattern corresponding to S_i and, only if the predicates of H and
S_i are in the same strongly connected component of the predicate dependency
graph, a query-mapping pair corresponding to H and S_i (1 ≤ i ≤ n). The rule r
means: to prove H (that is, find substitutions for its variables so that it follows
logically from the program) you have to prove S_1, ..., S_n. Since we use Prolog’s
computation rule we proceed from left to right. If we arrive at S_i this means
that we have already proved S_1, ..., S_{i-1}, so we can use for them the results
of the instantiation analysis and constraint inference. Sometimes several choices
for instantiations or constraints will be possible. We will pursue all of them. So
one query may generate several queries and pairs depending on the choices for
rule, subgoal and the instantiations and constraints for the subgoals preceding
the chosen subgoal. For each new query generated we repeat the process applied
to Q.

We now outline in detail the algorithm which creates a complete set of basic
query-mapping pairs, so that for each pair of nodes that are direct offspring of
each other in an LD-tree for the given program and a query that matches the
query pattern Q, there is a basic query-mapping pair in the set corresponding
to it.



Put BasicPairs = ∅, QueryQueue = [Q].

While QueryQueue ≠ ∅ do

{ Remove a query pattern Q1 from QueryQueue and repeat for it the following
two stage process for every possibility of program rule r, index i of body atom
of the rule, possible instantiation pattern and possible conjunctive constraint:

STAGE 1. Augmenting the weighted rule graph:

1. Construct the weighted rule graph G of a rule r: H :- S_1, ..., S_n such
that H has the same predicate as Q1.

2. Blacken the nodes of the head that are instantiated enough according to the 
query pattern, and add arcs and edges for the constraints that appear in the 
query pattern. 

3. Propagate the information about black nodes to infer, if possible, further 
black nodes. 

4. For each j, 1 ≤ j < i, choose an instantiation pattern, given by the
instantiation analysis, that is compatible with the black nodes of S_j in the sense
that if an argument in S_j is black then the corresponding argument in the
instantiation pattern is ie. If a node of S_j is white and the corresponding
argument in the instantiation pattern is ie, blacken that node and propagate
the information.

5. In case constraint inference was performed, insert for each S_j, 1 ≤ j < i,
edges and potential arcs in accordance with one of the disjuncts of the
constraint inferred for its predicate. (Note that if it is possible to prove
termination without using the constraint inference, it is usually preferable
to avoid it.)

6. Turn all potential arcs or potential weighted arcs to arcs, respectively
weighted arcs, if their endpoints are black, i.e. instantiated enough.

7. Add to G all the inferred edges and arcs between nodes of S_i.

8. If the predicate symbols of H and S_i are in the same strongly connected
component of the predicate dependency graph, add to G all the inferred
edges and arcs between nodes of H and S_i.

STAGE 2: Getting a new query pattern and possibly a new query-mapping pair 
from the augmented weighted rule graph: 

1. Convert all weighted arcs to arcs by deleting their labels.

2. Delete all nodes except those corresponding to argument positions of H and
S_i.

3. Delete labels of nodes, leaving only their being black or white.

4. Delete all edges and arcs except for edges that connect existing nodes and
arcs that connect existing black nodes.

5. If the predicates of H and S_i are in the same strongly connected component
of the predicate dependency graph put in BasicPairs a query-mapping pair
for which the query is Q1 and the mapping is given by the nodes of H and
S_i and the edges and arcs between them.

6. Generate a new query pattern with the argument positions of S_i and the
edges and arcs between them. If this query pattern has not been investigated
before, put it in QueryQueue.

} 
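
The outer loop of this algorithm is a standard worklist iteration, which can be sketched in Python as follows. The two stages are hidden behind a hypothetical callback expand(query), assumed to yield one (new_query, pair_or_None) result for every choice of rule, body atom, instantiation pattern and conjunctive constraint; pair_or_None is None when the predicates are not in the same strongly connected component. Queries and pairs are assumed to be hashable values.

```python
from collections import deque

def generate_basic_pairs(initial_query, expand):
    """Return (basic_pairs, investigated_queries) for the given query pattern."""
    basic_pairs = set()
    investigated = {initial_query}
    queue = deque([initial_query])
    while queue:
        q1 = queue.popleft()
        for new_query, pair in expand(q1):
            if pair is not None:
                basic_pairs.add(pair)
            if new_query not in investigated:   # each pattern is expanded once
                investigated.add(new_query)
                queue.append(new_query)
    return basic_pairs, investigated
```

The loop stops because every query pattern enters the queue at most once and only finitely many patterns exist for a given program.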



Note that this process must terminate, because there is only a finite number 
of query patterns that can be formed with the predicate symbols of the program. 

Let us return to the Ackermann function program in Section 2 and let us use
the term-size norm. The query pattern corresponding to ack(s(0),s(s(0)),A)
is the pattern ack(b,b,f) [] (note that [] denotes the empty constraint list).
Using the weighted rule graph for the rule

(ii) ack(s(M),0,A) :- ack(M,s(0),A).

and inserting the information from the query pattern we get the query-mapping
pair (a):

(a)  query: ack(b,b,f) []
     mapping: [figure: argument mapping between the nodes of ack (domain) and ack (range)]

The 'new' query generated in this case is the same as the query we started
with.

By the way, this is a circular idempotent pair which satisfies the condition
of Theorem 3.4: there is an arc between the first argument of the domain and
the first argument of the range. If the condition had not been satisfied we could
have halted, since our method would not have been able to prove termination.

If we take the rule 



(iii) ack(s(M),s(N),A) :- ack(s(M),N,A1), ack(M,A1,A). 



and augment the weighted rule graph on p. 469 with the information from the 
query, we get: 

For the first subgoal of the rule we get from this graph the query-mapping pair 





(b) query:   ack(b,b,f) [] 
    mapping: ack -> ack   [mapping diagram lost in extraction] 


and no new query is generated. 

If we do not use the results of the instantiation analysis, we would get from 
the graph, for the second subgoal of the rule, a new query ack(b,f,f). For this 
query we would not have been able to prove termination because it does not 
terminate. However, if we use the results of the instantiation analysis, we know 
that there are only two possible instantiation patterns for the predicate ack: 

ack(ie,ie,ie)     ack(ie,nie,nie) 



The only pattern that is compatible with the middle row of our graph is the first 
one. Using this pattern and propagating the information we get the augmented 
weighted rule graph 




From this graph we get for the second subgoal the old query and the query- 
mapping pair 



(c) query:   ack(b,b,f) 
    mapping: ack -> ack   [mapping diagram lost in extraction] 



No more queries or basic query-mapping pairs can be generated from the 
original query pattern. Now we have to use composition in order to get all 
possible query-mapping pairs. 



4.4 Creation of All Query-Mapping Pairs by Composition and the 
Termination Test 

Now we have to compose the basic query-mapping pairs until no more pairs can be 
created. Since the number of query-mapping pairs that can be associated with a 
program is finite, this process terminates. Whenever a circular idempotent pair 
is encountered, we check that there is an arc from an argument in the domain 
to the corresponding argument in the range. If there is none, we halt: our 
method cannot prove termination in this case. If all query-mapping pairs have 
been created, and every circular idempotent pair has an arc from an argument 
in the domain to the corresponding argument in the range, we know that the 
query pattern terminates. 
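
The closure-and-test loop just described can be illustrated with a deliberately simplified Python sketch. The simplifications are assumptions of this example, not part of the method: every pair is reduced to relations between a domain argument i and a range argument j, written (i, '>', j) for an arc and (i, '=', j) for an edge; instantiation information and relations inside the domain or inside the range are omitted; and all pairs are taken to be circular, with the same query pattern. In this reduced form the pairs behave like the size-change graphs of [41]. The names `compose`, `closure` and `proves_termination` are hypothetical.

```python
from itertools import product


def compose(g1, g2):
    """Compose two simplified mappings (frozensets of (i, rel, j) triples).

    An arc anywhere along a path gives an arc in the result; two edges
    give an edge. An arc between i and j takes precedence over an edge.
    """
    out = {}
    for (i, r1, k1) in g1:
        for (k2, r2, j) in g2:
            if k1 == k2:
                rel = '>' if '>' in (r1, r2) else '='
                if out.get((i, j)) != '>':
                    out[(i, j)] = rel
    return frozenset((i, rel, j) for (i, j), rel in out.items())


def closure(basic):
    """Compose pairs until no new pair appears (the set of pairs is finite)."""
    pairs = set(basic)
    changed = True
    while changed:
        changed = False
        for g1, g2 in product(list(pairs), repeat=2):
            g = compose(g1, g2)
            if g not in pairs:
                pairs.add(g)
                changed = True
    return pairs


def proves_termination(basic):
    """Every idempotent pair needs an arc from some argument to itself."""
    for g in closure(basic):
        idempotent = compose(g, g) == g
        has_self_arc = any(rel == '>' and i == j for (i, rel, j) in g)
        if idempotent and not has_self_arc:
            return False
    return True


# A simplified reading of the recursive calls of ack with the first two
# arguments bound (not the pairs (a)-(c) of the text): rule (ii) and the
# second subgoal of rule (iii) decrease argument 1; the first subgoal of
# rule (iii) keeps argument 1 and decreases argument 2.
A = frozenset({(1, '>', 1)})
B = frozenset({(1, '=', 1), (2, '>', 2)})
ack_ok = proves_termination({A, B})
loop_ok = proves_termination({frozenset({(1, '=', 1)})})
```

Here `ack_ok` is true, since the closure consists of A and B only and both carry a self-arc, while `loop_ok` is false: a call that merely preserves the size of its argument gives an idempotent pair without an arc, so the test fails.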

In our example composition of the pairs (a) and (b) gives a new pair (d): 

(d) query:   ack(b,b,f) [] 
    mapping: ack -> ack   [mapping diagram lost in extraction] 

Composing the pairs (a) and (c) gives the pair (e): 

(e) query:   ack(b,b,f) [] 
    mapping: ack -> ack   [mapping diagram lost in extraction] 

Composing the pairs (b) and (a) gives the pair (f): 

(f) query:   ack(b,b,f) 
    mapping: ack -> ack   [mapping diagram lost in extraction] 

No further pairs can be created by composition as the following 'multiplication 
table' of the pairs shows: 

[Multiplication table of the pairs (a)-(f) lost in extraction] 


In this case all the query-mapping pairs are circular and idempotent and 
satisfy the condition of Theorem 3.4, so we get that there is termination for the 
query pattern ack(b,b,f). 

It should be noted that the method we outlined, generating basic query-mapping 
pairs and then composing them until no new pairs can be obtained, lets us 
assign a pair to each call branch. This pair gives a not necessarily complete 
description of arguments that are instantiated enough and size relations between 
the norms of these arguments. The pair assigned to a call branch in this case 
depends not only on the endpoints of the branch (as was the case in the examples 
in Section 2), but also on the location of the branch in the tree. It is a sound 
approximation to the relation between the calls at the ends of the branch. 

The number of query-mapping pairs that can be formed with the predicate 
symbols of a program is exponential in their arities, and there are cases in which 
the number of query-mapping pairs relevant to a program and query is large. The 
following result about complexity is relevant. In [41] size-change graphs are used 
to prove termination of functional programs. These graphs are the counterpart 
of our query-mapping pairs in the simpler context of functional programming, 
where all the problems connected with instantiation fall away. It is proved there 
that the analysis is PSPACE-hard. However, if the maximal arity of a predicate, 
the maximal number of different variables in a clause and the maximal number 
of subgoals in a clause are bounded by a relatively small number, we can expect 
our method to behave reasonably, as illustrated by our experiments. 



4.5 Possible Optimizations 

The optimization of Theorem 3.5, that only query-mapping pairs in which the 
predicates of the domain and range belong to the same strongly connected com- 
ponent of the predicate dependency graph need be considered, was implemented 
in the TermiLog system from the beginning. 

Another optimization, implemented recently, is the following. Let us define 

Definition 4.6 (Weaker Version). Suppose that two query-mapping pairs $P_1$ 
and $P_2$ have identical queries and identical ranges of the mappings, and that 
every edge in $P_1$ is also included in $P_2$ and every arc in $P_1$ is also 
included in $P_2$. Then we will say that $P_1$ is a weaker version of $P_2$. 

Suppose that we discover that a query-mapping pair $P_1$ is a weaker version of 
a pair $P_2$. Then the pair $P_2$ can be discarded, since in the termination 
proof the weaker $P_1$ can be used in any place $P_2$ is used, and if termination 
can be proved, the edges and arcs of $P_1$ must be sufficient for the proof. 

In the above example of the Ackermann function the basic query-mapping 
pairs are (a), (b) and (c). The pair (c) is a weaker version of the pair (a), so 
we need only use composition for the pairs (b) and (c). By composing (b) and 
(c) we get the pair (f). The pair (f) is a weaker version of the pair (c), so it is 
enough to consider all the compositions of (b) and (f). This does not give rise 
to further pairs. Since the pairs (b) and (f) satisfy the condition of Theorem 3.4 
we get termination of the query pattern ack(b,b,f). 
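
The weaker-version test of Definition 4.6 amounts to two subset checks, and discarding stronger pairs is a simple filter. The following Python sketch shows one way to write it; the `Pair` record, its field names and the function names are assumptions of this example and are not taken from TermiLog.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Pair(NamedTuple):
    """A query-mapping pair in a hypothetical flat encoding."""
    query: str                          # e.g. 'ack(b,b,f)'
    range_: str                         # pattern of the range of the mapping
    edges: FrozenSet[Tuple[str, str]]   # equalities between nodes
    arcs: FrozenSet[Tuple[str, str]]    # strict inequalities between nodes


def is_weaker(p1, p2):
    """True if p1 is a weaker version of p2 (Definition 4.6)."""
    return (p1.query == p2.query and p1.range_ == p2.range_
            and p1.edges <= p2.edges and p1.arcs <= p2.arcs)


def prune(pairs):
    """Keep only pairs for which no weaker version is present."""
    kept = []
    for p in pairs:
        if any(is_weaker(q, p) for q in kept):
            continue  # a kept pair is weaker than (or equal to) p
        # p replaces every kept pair of which it is a weaker version.
        kept = [q for q in kept if not is_weaker(p, q)] + [p]
    return kept


weak = Pair('q(b,b)', 'q(b,b)', frozenset(), frozenset({('d1', 'r1')}))
strong = Pair('q(b,b)', 'q(b,b)', frozenset({('d2', 'r2')}),
              frozenset({('d1', 'r1')}))
other = Pair('q(b,f)', 'q(b,b)', frozenset(), frozenset())
survivors = prune([strong, weak, other])
```

In this made-up input `strong` contains every edge and arc of `weak`, so it is dropped; `other` has a different query and is unaffected.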

4.6 Instantiation Analysis 

Instantiation analysis of a given program is a preliminary step for our algo- 
rithm. We prove termination by using norm inequalities between sufficiently 
instantiated terms. The method used for the instantiation analysis is abstract 
interpretation as outlined in [24] . 

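Definition 4.7 (Galois Connection). Given two partially ordered sets 
$P^{\natural}(\leq^{\natural})$ and $P^{\sharp}(\leq^{\sharp})$, a Galois 
connection between them is a pair of maps 

$$\alpha : P^{\natural} \to P^{\sharp}, \qquad \gamma : P^{\sharp} \to P^{\natural}$$

such that 

$$\forall p^{\natural} \in P^{\natural} : \forall p^{\sharp} \in P^{\sharp} : \quad \alpha(p^{\natural}) \leq^{\sharp} p^{\sharp} \iff p^{\natural} \leq^{\natural} \gamma(p^{\sharp}).$$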

Both for the instantiation analysis and the constraint inference we use the 
following Theorem from [24]: 

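Theorem 4.1 (Fixpoint Abstraction). If $P^{\natural}(\leq^{\natural})$ and 
$P^{\sharp}(\leq^{\sharp})$ are complete lattices, there is a Galois connection 
between $P^{\natural}$ and $P^{\sharp}$ given by $\alpha$ and $\gamma$, and 
$F^{\natural}$ is a monotone map from $P^{\natural}$ to $P^{\natural}$, then 

$$\alpha(\mathrm{lfp}(F^{\natural})) \leq^{\sharp} \mathrm{lfp}(\alpha \circ F^{\natural} \circ \gamma).$$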

Groundness analysis has been handled by several authors (cf. for example 
[24], [20]). Here we extend the ideas used for groundness analysis to the more 
general case of being instantiated enough for an arbitrary symbolic norm we have 
chosen. An argument is instantiated enough if its symbolic norm is an integer. 

The usual Herbrand base whose atoms are all ground is not useful in this 
case and we have to extend it. We define a semantics which is similar, although 
not identical, to the c-semantics of [34,15]. As extended Herbrand base B we 
take all atoms that can be built from the constant, function, and predicate 
symbols of the program P and variables from an infinite set {A1, A2, ...}. The 
difference from the s-semantics and c-semantics is that there equivalence classes 
of atoms modulo variance are taken, while we take atoms with variables from 
the infinite set. The redundancy does not disturb us, since we are interested in 
the abstraction, which is finite. We define the immediate consequence operator 
$T_P$ for any $I \subseteq B$ in the following way: 

$$T_P(I) = \{A \in B : A \leftarrow A_1, \ldots, A_n \text{ is an instance of a clause in } P \text{ and } \{A_1, \ldots, A_n\} \subseteq I\}.$$

The least fixed point of $T_P$ consists exactly of those elements of B that have 
an SLD-refutation with the identity substitution as computed answer. We shall 
call it the minimal extended Herbrand model for the program. Now consider 
$P^{\natural} = 2^{B}$, the complete lattice of all subsets of B, with the 
inclusion order. We define a Galois connection between $P^{\natural}$ and 
$P^{\sharp}$, the power set (i.e. the set of subsets) of the set of all atoms 
whose predicate symbol is one of the predicates of the program, and whose 
arguments are ie, representing an argument that is instantiated enough for its 
symbolic norm to be an integer, and nie, representing an argument which is not 
instantiated enough. (In the case of groundness analysis usually g for ground 
and ng for non-ground are used instead of our ie and nie.) We define 

for a term T 

$$\alpha(T) = ie \text{ if the symbolic norm of } T \text{ is an integer}, \qquad \alpha(T) = nie \text{ otherwise},$$

for a predicate symbol p 

$$\alpha(p(T_1, \ldots, T_n)) = p(\alpha(T_1), \ldots, \alpha(T_n)),$$

and for a set of atoms S 

$$\alpha(S) = \{\alpha(s) \mid s \in S\}.$$

For $p^{\sharp} \in P^{\sharp}$ we define 

$$\gamma(p^{\sharp}) = \{A \in B : \alpha(A) \in p^{\sharp}\},$$

that is, $\gamma(p^{\sharp})$ consists of all the atoms in B that have the 
instantiation patterns included in $p^{\sharp}$. These $\alpha$ and $\gamma$ 
determine a Galois connection between $P^{\natural}$ and $P^{\sharp}$. 

(Footnote to Theorem 4.1: A complete lattice is a partially ordered set such 
that every subset has a least upper bound and a greatest lower bound, cf. [48].) 

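$T_P$ is a monotone map from $P^{\natural}$ to $P^{\natural}$, hence we get from 
the Fixpoint Abstraction Theorem that 

$$\alpha\Big(\bigcup_{i=1}^{\infty} T_P^{\,i}(\emptyset)\Big) \subseteq \bigcup_{i=1}^{\infty} (\alpha \circ T_P \circ \gamma)^{i}(\emptyset).$$

Since $P^{\sharp}$ is finite, the right hand side can be computed by a finite 
number of steps. 

Define $T^{\sharp} = \alpha \circ T_P \circ \gamma$. 

Algorithm for the Computation of the least fixed point of $T^{\sharp}$ 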

For each clause in the program, supposing it contains n different variables, 
we create all $2^n$ instances of substituting ie and nie for each of them. For 
each instance of a clause we then substitute for the arguments of the predicates 
ie or nie in accordance to whether or not they are instantiated enough for the 
norm (this can be decided by replacing the nie's in the clause instance with a 
variable, and then computing norms; the result should be ie if the computed norm 
is an integer and nie otherwise). This gives us the clauses of a new program, that 
has only ground arguments, and hence its success set is finite and can be computed 
as the least fixed point of the immediate consequence operator. But this is exactly 
the least fixed point of $T^{\sharp}$, because the new program describes its action 
on the atoms of $P^{\sharp}$. 

Consider for example the append program 

Example 4.1. 
append([],X,X). 
append([H|X],Y,[H|Z]) :- append(X,Y,Z). 

□ 

with the term-size norm. A term is instantiated enough with respect to this norm 
if and only if it is ground. Hence, we use in this example g and ng instead of ie 
and nie. From the clause 

append([],Y,Y). 

we get for the new program 

append(g,g,g). 
append(g,ng,ng). 

In the clause 

append([H|X],Y,[H|Z]) :- append(X,Y,Z). 

we have to substitute all $2^4$ possibilities of g and ng for its variables. For 
instance the substitution 

H ↦ ng, X ↦ g, Y ↦ g, Z ↦ g 

would give us 

append([ng|g],g,[ng|g]) :- append(g,g,g). 

which would give us the rule for $T^{\sharp}$ 

append(ng,g,ng) :- append(g,g,g). 

We get 

$$T^{\sharp}(\emptyset) = \{append(g,g,g),\ append(g,ng,ng)\}$$

$$(T^{\sharp})^{2}(\emptyset) = \{append(g,g,g),\ append(g,ng,ng),\ append(ng,g,ng),\ append(ng,ng,ng)\}$$

$$(T^{\sharp})^{3}(\emptyset) = (T^{\sharp})^{2}(\emptyset), \qquad \bigcup_{i=1}^{\infty} (T^{\sharp})^{i}(\emptyset) = (T^{\sharp})^{2}(\emptyset)$$

On the other hand if we used the list-size norm we would get, for example, 
from the above rule and the substitution 

H ↦ nie, X ↦ ie, Y ↦ ie, Z ↦ ie 

the instance 

append([nie|ie],ie,[nie|ie]) :- append(ie,ie,ie). 

and hence, since for the list-size norm $\|[H|T]\| = 1 + \|T\|$, the rule 

append(ie,ie,ie) :- append(ie,ie,ie). 

In this case 

$$T^{\sharp}(\emptyset) = \{append(ie,ie,ie),\ append(ie,nie,nie)\}$$

$$\bigcup_{i=1}^{\infty} (T^{\sharp})^{i}(\emptyset) = (T^{\sharp})^{2}(\emptyset) = T^{\sharp}(\emptyset)$$
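
The computation above is small enough to replay mechanically. The Python sketch below is an illustration under stated assumptions, not the TermiLog implementation: terms are encoded as nested tuples `(functor, arg, ...)`, variables as strings, list cells use the functor `'.'`, and the function names are hypothetical. It builds the abstract program by trying every ie/nie assignment to the variables of each clause and then computes the least fixed point bottom-up.

```python
from itertools import product


def is_var(t):
    return isinstance(t, str)


def variables(t):
    """Set of variables of a term or atom in the tuple encoding."""
    if is_var(t):
        return {t}
    found = set()
    for arg in t[1:]:
        found |= variables(arg)
    return found


def term_size_ie(t, env):
    # Term-size norm: instantiated enough iff every variable is.
    return all(env[v] == 'ie' for v in variables(t))


def list_size_ie(t, env):
    # List-size norm: only the tail of the list matters.
    while not is_var(t) and t[0] == '.':
        t = t[2]
    return env[t] == 'ie' if is_var(t) else True


def abstract_program(clauses, ie):
    """All ie/nie instances of the clauses, with arguments abstracted."""
    abstract = set()
    for head, body in clauses:
        vs = sorted(set().union(variables(head), *(variables(b) for b in body)))
        for values in product(('ie', 'nie'), repeat=len(vs)):
            env = dict(zip(vs, values))

            def abs_atom(atom):
                return (atom[0],) + tuple(
                    'ie' if ie(a, env) else 'nie' for a in atom[1:])

            abstract.add((abs_atom(head), tuple(abs_atom(b) for b in body)))
    return abstract


def lfp(abstract):
    """Bottom-up least fixed point of the ground abstract program."""
    facts = set()
    while True:
        new = {h for h, body in abstract if all(b in facts for b in body)}
        if new <= facts:
            return facts
        facts |= new


APPEND = [
    (('append', ('[]',), 'X', 'X'), []),
    (('append', ('.', 'H', 'X'), 'Y', ('.', 'H', 'Z')),
     [('append', 'X', 'Y', 'Z')]),
]
term_size_patterns = lfp(abstract_program(APPEND, term_size_ie))
list_size_patterns = lfp(abstract_program(APPEND, list_size_ie))
```

With the term-size norm this yields the four patterns listed above (reading ie as g and nie as ng); with the list-size norm it yields the two patterns append(ie,ie,ie) and append(ie,nie,nie).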

It should be noted that in the conclusion of the Fixpoint Abstraction 
Theorem we have inclusion, so this process gives us a superset of instantiations 
that may occur. To give an example where the inclusion is proper consider the 
following example: 

Example 4.2. 
p(X) :- g(X), h(X). 
g(a). 
h(b). 

□ 

with the term-size norm. In this case we get the instantiation p(ie) although 
there is no solution for p(X). 

The way we will use the instantiation analysis is as follows. Suppose we 
use the term-size norm and are given a subgoal to the append program with 
certain bindings, say append(b,b,f), where b denotes a ground argument and f 
denotes an argument for which we do not know if it is ground. Then we know 
from the instantiation analysis what bindings cannot result for the arguments of 
this subgoal if it terminates successfully. For instance in the above case, we can 
infer that the third argument will become ground, because the only instantiation 
pattern in which the first two arguments are ground is append(g,g,g). 

4.7 The Inference of Constraints 

Inference of monotonicity constraints was treated in [16] and inference of 
inequality constraints was treated in [17]. Monotonicity constraints have the 
advantage that once an atom is given, there is only a finite number of 
possibilities for them, but they are weak. Inequality constraints, say something 
like $\|arg_1\| > \|arg_2\| + n$ where n is a number, give more information, but 
there are infinitely many possibilities for them. The algorithm for constraint 
inference proposed here tries to get the best of both worlds: in the derivation 
step it uses quantitative norm computations, while its conclusions are formulated 
as monotonicity constraints. This enables us, for instance, to show termination of 
quicksort. 

We again use abstract interpretation, only things are more complicated. 
$P^{\natural} = 2^{B}$ and $T_P$ are as before. The set C of abstractions of 
elements of B consists of mixed graphs, whose nodes are the argument positions of 
some predicate in the program and whose edges and arcs form a consistent set of 
constraints between the nodes. Such graphs can be denoted by a pair consisting of 
the predicate (with arity) and the list of edges and arcs. An edge connecting the 
i'th and j'th nodes will be given by eq(i,j) if i < j and by eq(j,i) if i > j, and 
an arc going from the i'th node to the j'th node will be given by gt(i,j) (remember 
that our arcs go from a node of larger norm to a node of smaller norm). We will 
call such graphs conjunctive predicate constraints. 

Consider $2^{C}$, whose elements are sets of conjunctive constraints. An element 
of $2^{C}$ is interpreted as the disjunction of the conjunctive constraints of 
which it consists. For $c_1, c_2 \in 2^{C}$ define an equivalence $c_1 \sim c_2$ 
iff the set of all ground atoms that satisfy a constraint in $c_1$ is equal to the 
set of all ground atoms that satisfy a constraint in $c_2$. For instance 

$$\{(p/2, [])\} \sim \{(p/2, [gt(2,1)]),\ (p/2, [gt(1,2)]),\ (p/2, [eq(1,2)])\}$$

Let $P^{\sharp} = 2^{C}/{\sim}$, that is the set of equivalence classes of 
elements of $2^{C}$. For an element $c \in 2^{C}$ we denote by $c_{\sim}$ its 
equivalence class. As $\alpha$ we take the map that assigns to each atom in B the 
unique element in C that has the constraints that can be inferred for the atom, 
using the particular symbolic norm we have chosen. Given an atom $A \in B$ we 
compute the symbolic norms of its arguments. If the predicate of A is p/n then 
$\alpha(A)$ will consist of the pair (p/n, List), where List contains elements 
eq(i,j) (i < j) if the symbolic norm of the i'th argument of A equals the norm of 
its j'th argument, and contains gt(i,j) if the normalized form of the difference 
between the norms of the i'th and j'th arguments of A, which is a linear 
expression in the variables, has non-negative coefficients and the constant term 
is positive (say something like 1 + X or 2 + 6X + 5Y). Note that we assign to each 
atom one conjunctive predicate constraint. For instance, with the term-size or 
list-size norms, 

$$\alpha(append([],Y,Y)) = \{(append/3, [eq(2,3)])\}$$

$$\alpha(p([H|X],X,Y,[H|Y],X)) = \{(p/5, [gt(1,2),\ gt(4,3),\ gt(1,5),\ eq(2,5)])\}$$

(We could have chosen another definition of $\alpha$, which would have assigned 
to an atom in B a set of several elements in C.) We extend $\alpha$ to a map from 
$2^{B}$ to $P^{\sharp}$ in the obvious way. 
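
The abstraction of a single atom can be checked with a short Python sketch. It assumes the term-size norm in the form "arity plus the sum of the norms of the arguments", with a variable standing for itself, and the same hypothetical tuple encoding of terms as before (variables are strings, list cells use the functor `'.'`); the function names are illustrative only.

```python
from collections import Counter


def term_size(t):
    """Symbolic term-size norm as (constant, Counter of variable coefficients)."""
    if isinstance(t, str):              # a variable contributes itself
        return 0, Counter({t: 1})
    const, coeffs = len(t) - 1, Counter()   # arity of the functor
    for arg in t[1:]:
        c, cs = term_size(arg)
        const += c
        coeffs.update(cs)
    return const, coeffs


def alpha(atom):
    """Conjunctive predicate constraint inferred for one atom."""
    norms = [term_size(a) for a in atom[1:]]
    n = len(norms)
    constraints = set()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            (ci, vi), (cj, vj) = norms[i], norms[j]
            if ci == cj and vi == vj:
                if i < j:                       # record each edge once
                    constraints.add(('eq', i + 1, j + 1))
                continue
            diff = [vi[v] - vj[v] for v in set(vi) | set(vj)]
            # Arc: positive constant term, no negative coefficient.
            if ci - cj > 0 and all(d >= 0 for d in diff):
                constraints.add(('gt', i + 1, j + 1))
    return ('%s/%d' % (atom[0], n), constraints)


fact = alpha(('append', ('[]',), 'Y', 'Y'))
example = alpha(('p', ('.', 'H', 'X'), 'X', 'Y', ('.', 'H', 'Y'), 'X'))
```

For the two atoms of the text this gives eq(2,3) for the append fact, and gt(1,2), gt(4,3), gt(1,5) and eq(2,5) for the atom with predicate p.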

The order in $P^{\natural}$ is the inclusion order. In $P^{\sharp}$ we define the 
order $p_1^{\sharp} \leq^{\sharp} p_2^{\sharp}$ iff the set of all ground atoms 
that satisfy one of the constraints in $p_1^{\sharp}$ is included in the set of 
all ground atoms that satisfy one of the constraints in $p_2^{\sharp}$. For example 

$$\{(p/3, [eq(1,2),\ gt(3,2),\ gt(3,1)])\}_{\sim} \leq^{\sharp} \{(p/3, [gt(3,2)])\}_{\sim}$$

$$\{(p/3, [eq(1,2)]),\ (p/3, [gt(2,1)])\}_{\sim} \leq^{\sharp} \{(p/3, [])\}_{\sim}$$

For $c \in C$ we define 

$$\gamma(c) = \{A \in B : \{\alpha(A)\}_{\sim} \leq^{\sharp} \{c\}_{\sim}\}$$

and extend the definition to $P^{\sharp}$ in the obvious way. It is easy to see 
that we get a Galois connection and hence, by the Fixpoint Abstraction Theorem, if 
we define $T^{\sharp} = \alpha \circ T_P \circ \gamma$, 

$$\alpha(\mathrm{lfp}(T_P)) \leq^{\sharp} \mathrm{lfp}(T^{\sharp}).$$

Now we will define an operator 

$$\tau : 2^{C}/{\sim} \to 2^{C}/{\sim}$$

that approximates $T^{\sharp}$ from above in the sense that for each 
$I \in P^{\sharp}$ we have $T^{\sharp}(I) \leq^{\sharp} \tau(I)$. 

Definition 4.8 (Inferred Head Constraint). Given $I \in 2^{C}$ and a weighted 
rule graph we insert for the nodes of each body atom in the weighted rule graph 
the edges and (potential) arcs of one relevant conjunctive constraint from I. With 
the help of the weighted rule graph we infer the edges and (potential) arcs for the 
nodes of the head atom. The resulting conjunctive constraint is an inferred head 
constraint. 

We define 

$$\tau(\emptyset) = \{\alpha(F) : F \text{ is a fact of the program}\}$$

For a non-empty set $I \in 2^{C}$ (actually we are dealing with equivalence 
classes) we define 

$$\tau(I) = I \cup \{c : c \text{ is an inferred head constraint relative to } I \text{ and the weighted rule graph of some program rule}\}.$$

Since $\tau$ is monotone and C is finite the least fixed point of $\tau$ exists 
and the computation always terminates. 

Algorithm for the inference of constraints: computation of the 
least fixed point of $\tau$: 

Old_Constraints = ∅ 

New_Generation_Constraints = {α(F) : F is a fact of the program} 

While New_Generation_Constraints ≠ ∅ do 

1. Let Derived be all those constraints that can be inferred by taking the 
weighted rule graph for some program rule and inserting constraints for the 
subgoals according to elements of Old_Constraints and 
New_Generation_Constraints, taking care that at least one constraint from the 
latter is used. 

2. Old_Constraints = Old_Constraints ∪ New_Generation_Constraints 

3. New_Generation_Constraints = Derived − Old_Constraints 

Return Old_Constraints. 
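
The loop above is a semi-naive fixed-point iteration: each round only looks at derivations that use at least one constraint of the latest generation. The Python sketch below shows the control structure. The inference step itself (reading head constraints off a weighted rule graph) is not implemented; `derive_append` is a hand-written stand-in for the recursive clause of append under the term-size norm and, like all names here, is an assumption of this example.

```python
def infer_constraints(fact_constraints, derive):
    """Least fixed point by generations; derive(old, new) must return the
    constraints inferable using at least one element of `new`."""
    old = set()
    new = set(fact_constraints)
    while new:
        derived = derive(old, new)
        old |= new
        new = derived - old
    return old


EQ23 = ('append/3', (('eq', 2, 3),))
GT32 = ('append/3', (('gt', 3, 2),))


def derive_append(old, new):
    """Hypothetical stand-in for the weighted rule graph of
    append([H|X],Y,[H|Z]) :- append(X,Y,Z).
    The head's third argument exceeds the body's third argument, so a body
    constraint eq(2,3) or gt(3,2) yields gt(3,2) for the head. The rule has
    a single subgoal, so its constraint must come from `new`."""
    derived = set()
    for _pred, constraints in new:
        if ('eq', 2, 3) in constraints or ('gt', 3, 2) in constraints:
            derived.add(GT32)
    return derived


result = infer_constraints({EQ23}, derive_append)
```

Starting from the constraint of the fact append([],Y,Y), the loop stops after two generations with the two conjunctive constraints eq(2,3) and gt(3,2).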

For each $I \in P^{\sharp}$ we have $T^{\sharp}(I) \leq^{\sharp} \tau(I)$, because 
by our methods we may not be inferring all the constraints and hence, because the 
least fixed points exist, 

$$\mathrm{lfp}(T^{\sharp}) \leq^{\sharp} \mathrm{lfp}(\tau).$$

To give an example where $T^{\sharp}(I)$ differs from $\tau(I)$ take the append 
program (Example 4.1). 

$$\tau(\emptyset) = \{(append/3, [eq(2,3)])\}$$

$$T^{\sharp}(\emptyset) = \{(append/3, [eq(1,2),\ eq(2,3),\ eq(1,3)]),\ (append/3, [gt(2,1),\ gt(3,1),\ eq(2,3)])\}$$

Using the Fixpoint Abstraction Theorem we get 

$$\alpha(\mathrm{lfp}(T_P)) \leq^{\sharp} \mathrm{lfp}(T^{\sharp}) \leq^{\sharp} \mathrm{lfp}(\tau).$$

This means that every atom in the success set of the program satisfies at least 
one of the conjunctive predicate constraints in $\mathrm{lfp}(\tau)$. 

Consider for example the append program (Example 4.1). Then 

$$\tau(\emptyset) = \{(append/3, [eq(2,3)])\}$$

The weighted rule graph for the second clause of the program is 

[Figure: weighted rule graph. Head nodes append^(1): [H|X], Y, [H|Z]; body nodes 
append^(2): X, Y, Z; the connections from [H|X] to X and from [H|Z] to Z carry the 
weight 2+H.] 

Putting for the body nodes (the ones with append^(2)) the one constraint we 
got thus far we get the constraint (append/3, [gt(3,2)]), so we get 

$$\tau^{2}(\emptyset) = \{(append/3, [eq(2,3)]),\ (append/3, [gt(3,2)])\}$$

Applying $\tau$ to this new set we realize that no new constraint can be derived, 
so we have found a fixed point for $\tau$. So we have inferred a disjunctive 
constraint for append consisting of two conjunctive constraints. 

We found it advantageous to keep track of whether a constraint is obtained 
just once during the inference or more than once. Consider for example the 
program mergesort. 

mergesort([],[]). 
mergesort([X],[X]). 
mergesort([X,Y|Xs],Ys) :- split([X,Y|Xs],X1s,X2s), 
    mergesort(X1s,Y1s), mergesort(X2s,Y2s), merge(Y1s,Y2s,Ys). 

split([],[],[]). 
split([X|Xs],[X|Ys],Zs) :- split(Xs,Zs,Ys). 

merge([],Xs,Xs). 
merge(Xs,[],Xs). 
merge([X|Xs],[Y|Ys],[X|Zs]) :- X=<Y, merge(Xs,[Y|Ys],Zs). 
merge([X|Xs],[Y|Ys],[Y|Zs]) :- X>Y, merge([X|Xs],Ys,Zs). 

If we try to infer constraints for the predicate split we get after the first 
application of $\tau$ to ∅ the constraint [eq(2,1), eq(3,1)], after a second 
application the constraint [eq(2,1), gt(1,3)], and for all further applications 
only the constraint [gt(1,2), gt(1,3)]. The disjunction of these three constraints 
is not sufficient for inferring the termination of mergesort by the query-mapping 
pairs method, while the last constraint by itself is. In this case we can separate 
by unfolding the predicate split into three predicates and obtain the program 
mergesort1, 

mergesort([],[]). 
mergesort([X],[X]). 
mergesort([X,Y|Xs],Ys) :- split2([X,Y|Xs],X1s,X2s), 
    mergesort(X1s,Y1s), mergesort(X2s,Y2s), merge(Y1s,Y2s,Ys). 

split(Xs,Ys,Zs) :- split0(Xs,Ys,Zs). 
split(Xs,Ys,Zs) :- split1(Xs,Ys,Zs). 
split(Xs,Ys,Zs) :- split2(Xs,Ys,Zs). 

split0([],[],[]). 
split1([X],[X],[]). 
split2([X,Y|Xs],[X|Ys],[Y|Zs]) :- split(Xs,Ys,Zs). 

merge([],Xs,Xs). 
merge(Xs,[],Xs). 
merge([X|Xs],[Y|Ys],[X|Zs]) :- X=<Y, merge(Xs,[Y|Ys],Zs). 
merge([X|Xs],[Y|Ys],[Y|Zs]) :- X>Y, merge([X|Xs],Ys,Zs). 

for which we can prove termination by the query-mapping pairs method. More 
details on this unfolding-based technique can be found in [47]. 

5 The Implementation and Experimental Evaluation 

5.1 The Implementation 

The TermiLog system ([46]) is based on the approach outlined in this paper. 
The version available on the web ([74]) can use the term-size norm and the 
list-size norm. In the full version (an off-line version available on request) 
the user can also define other linear norms. Moreover, programs that use modules 
can be handled: first the module is analyzed and the results are put in a file, 
which is then used when the program calling the module is analyzed. For handling 
larger programs there is an option of using a 'big version', which infers the 
subqueries that will be created from the original query by using only the results 
of the instantiation analysis, and then checks only those subqueries that have a 
recursive predicate. 

Predefined predicates can be handled if their instantiation patterns are 
supplied to the system (recall that the instantiation patterns depend on the norm). 
In the current implementation, the instantiation patterns of most predefined 
predicates are already included in the system. Constraints of predefined 
predicates may also be included, but this is not necessary. Operator declarations 
that appear in a given program are asserted as facts of the system. 

Control predicates have to be dealt with in a special way, since they are not 
part of the declarative semantics of logic programs. Cuts are simply ignored. Of 
course, if the semantics of cut is needed to show termination, then our system 
will not be able to determine that the given program terminates. 

Other control predicates are handled by transforming the given program 
into a new program that does not have these control predicates, such that if 
termination can be shown for the new program, then the original program also 
terminates. 

If a negated subgoal appears in a clause, say 

A :- B, C, \+D, E, F. 

then the above clause is replaced with the following two clauses: 

A :- B, C, D. 

A :- B, C, E, F. 

If several negations appear in the same clause, they can be handled by repeated 
application of the above transformation. 

There is one point here that should be taken into account. The clause 

A :- B, C, D. 

should not be used in the instantiation analysis and constraint inference, since 
for A to succeed D should fail, that is D cannot provide any bindings. 
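
The transformation for negated subgoals, including its repeated application, can be written as a short recursive function. The Python sketch below is illustrative only: atoms are plain strings, a negated literal is the pair `('not', atom)`, and the boolean flag attached to each produced clause (meaning "use for the termination test only, not for instantiation analysis or constraint inference") is a bookkeeping device assumed for this example.

```python
def eliminate_negation(head, body):
    """Replace a clause with negated subgoals by negation-free clauses.

    Returns a list of (head, body, termination_only) triples. The flag is
    True for the clauses ending in the formerly negated atom, which must be
    excluded from instantiation analysis and constraint inference.
    """
    for k, literal in enumerate(body):
        if isinstance(literal, tuple) and literal[0] == 'not':
            prefix = body[:k]
            # Clause that calls the negated atom after the preceding subgoals.
            called = (head, prefix + [literal[1]], True)
            # Clause that skips the negation; later negations are handled
            # by applying the transformation again.
            rest = eliminate_negation(head, prefix + body[k + 1:])
            return [called] + rest
    return [(head, body, False)]


single = eliminate_negation('A', ['B', 'C', ('not', 'D'), 'E', 'F'])
double = eliminate_negation('A', [('not', 'B'), 'C', ('not', 'D')])
```

For the clause of the text, `single` contains the two clauses A :- B, C, D (flagged) and A :- B, C, E, F. With two negations, `double` contains A :- B (flagged), A :- C, D (flagged) and A :- C.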

Disjunction, “if-then” and “if-then-else” are handled in an analogous way. 
We will mention just one small example illustrating the usefulness of the 
system. 

Take the following predicate sublist, whose intended function is to find 
whether one list is a sublist of another: 

sublist(X,Y) :- append(X1,X,X2), append(X2,X3,Y). 
append([],X,X). 
append([H|X],Y,[H|Z]) :- append(X,Y,Z). 

This is probably the most natural way to express the sublist relation (cf. [77]). 
However, if the query pattern sublist(b,b) is given, our system will say that there 
may be non-termination because the first append gets the binding append(f,b,f), 
and indeed a query like sublist([1],[2,3]) does not terminate. 

If we switch the subgoals (as it is done in [73]) 

sublist(X,Y) :- append(X2,X3,Y), append(X1,X,X2). 

our system shows that the query sublist(b,b) terminates. 

This example shows how a user who writes a program according to the logic 
of a problem without thinking about the execution mechanism of Prolog can 
benefit from our system. This is especially true for predicates that succeed on 
the first call but go into an infinite loop when backtracked into. Switching the 
order of the subgoals does not change the logic of the program, but it may, as it 
does in this example, improve the termination behaviour. 

5.2 Comparison with Other Automatic Approaches to Termination 
Analysis 

When our system was developed, the CLP(R) package of SICStus Prolog [68] 
was not yet available, so the constraints we inferred, within the framework of 
standard Prolog, were only equality or monotonicity relations between argu- 
ments (this is in contrast with later systems like TerminWeb and cTI, that use 
CLP(R)). Therefore our system cannot handle the following program from [58]: 

perm([],[]). 
perm(L,[H|T]) :- append(V,[H|U],L), 
                 append(V,U,W), 
                 perm(W,T). 

append([],Y,Y). 
append([H|X],Y,[H|Z]) :- append(X,Y,Z). 

The query perm(b,f) terminates for this program, but our system cannot show 
termination. To prove termination we need the fact that for append the sum 
of the term-sizes of the first and second arguments equals the term-size of the 
third. 

We can transform the definition of append to a form in which our system, 
which only infers term-size equality for arguments, will be able to infer the above 
linear equality: 

perm([],[]). 
perm(L,[H|T]) :- append1(p(V,[H|U]),s(s(L))), 
                 append1(p(V,U),s(s(W))), 
                 perm(W,T). 

append1(p([],Y),s(s(Y))). 
append1(p([H|X],Y),s(s([H|Z]))) :- append1(p(X,Y),s(s(Z))). 

(The functors p and s are used so that, for atoms with predicate append1 in the 
success set, the norms of the two arguments will be equal.) For the transformed 
program, which is clearly equivalent to the original one, our system easily proves 
termination of perm(b,f). 

In [9] argument size relationships are inferred with CLP(R). In TerminWeb's 
old version (cf. [23]) ideas similar to our query-mapping pairs method are aug- 
mented with the analysis of [9], so termination of the example can be proved. It 
also can be proved with cTI. 

cTI is a bottom-up constraint-based inference tool for Prolog (cf. [51,50]). 
This tool is goal independent — it infers sufficient universal left-termination 
conditions for all the predicates in the given program (the recent implementation 
of TerminWeb does so too — cf. [37]). The new version of cTI, that uses the 
Parma Polyhedra Library (cf. [7]), is very efficient. However, it cannot prove 
termination for the program vangelder with query q{b,b) (see Table 4), which 
TermiLog can do. cTI uses the term-size norm only (but as we have seen this is 
the most useful of the linear norms). 

In contrast with the termination inference systems, TermiLog is goal directed, 
and in case it suspects non-termination it presents to the user the circular 
idempotent query-mapping pair that did not satisfy the termination test, thus 
showing which predicate should be suspected. 

The constraint based approach [27] is implemented but not publicly available. 
It starts with a general level mapping and general linear norm and infers the 
coefficients. This is efficient when it can be done but may run into trouble when 
there are nested expressions (because then we get products of the coefficients) . It 
cannot handle the program of Ackermann’s function (that TermiLog can handle) 
because in that case we need argument-by-argument comparisons instead of 
weighted sums. 

The following two termination analyzers fall in a slightly different category, 
as both of them impose requirements on the logic programs handled. 

TALP ([55]) is a publicly available tool that proves termination of logic pro- 
grams by transforming them into term-rewriting systems. It requires the pro- 
gram and query to be well-moded. This requirement follows from the fact that 
term-rewriting systems have a clear distinction between input and output. This 
strongly differs from logic programs, where the same predicate can be used in 
different modes. 

The compiler of the Mercury programming language contains a termination 
checker. This is described in [72] and its times are compared to TermiLog's 
times for the benchmarks in our tables. The Mercury termination checker is 
usually faster than TermiLog, but one must remember that in Mercury the text 
of the program being checked contains mode information as part of the language 
requirements. Another termination checker for Mercury is described in [35]. 



5.3 Experimental Evaluation 

The technique presented here was implemented in the TermiLog system [46]. 
The system has been implemented in SICStus Prolog [68] . 

The 1996 version of the system has been applied to well over 100 logic pro- 
grams, some quite long. The detailed experimental results can be found in [43,45]. 
It should be noted that the times for the system as it was written in 1996 are 
now an order of magnitude faster because of improvements in computers and in 
SICStus Prolog. 

We have now improved the efficiency of the system in two ways: 

1. Instead of checking for the presence of a forward positive cycle in circular 
pairs (as was done in the old version) we now only check for an arc between 
corresponding arguments in circular idempotent pairs. It turns out that this 
not only adds to the efficiency of the system but also to its power (cf. [29]). 

2. We implemented the optimization of Section 4.5. 

We tested the improved system on the benchmarks we had and the results are 
reported here. First, classical benchmarks for termination of symbolic compu- 
tations were studied. Tables 1 and 2 summarise the performance of the system 
on the programming examples of [5,30], and [58]. Next (Table 3), we applied 
our technique to study termination of benchmarks that were originally used for 
study of parallelism in logic programming [18,36,53,19]. Benchmarks in this col- 
lection go back to Ramkumar and Kale [61] (occur), Santos-Costa et al. [67] 
(zebra), Tick [75] (bid) and Warren (palin, qplan, warplan). A complete descrip- 
tion of this collection may be found in [53]. Finally (Table 4), we have collected 
a number of programs from different sources, including Prolog textbooks [1,73] 
and research papers [2,4,77,83,82]. Termination of most of the examples con- 
sidered, with clearly indicated exceptions such as serialize, was established by 
using the term-size norm. Note also that there are cases in which we can establish 
termination of a query pattern both with the term-size norm and the list-size 
norm. Since being instantiated enough depends on the norm, the kind of queries 
corresponding to a query pattern will depend in these cases on the norm. 

Note that all the benchmarks referred to in the tables are available, in compressed
and uncompressed form, on the homepage [42].

In the tables the following abbreviations are used: 

- Ref denotes a reference to the paper from which the program is taken.

- F and R denote, respectively, the numbers of facts and rules in the program.

- Inst, Constr and Prs denote, respectively, the times in seconds for the
instantiation analysis, constraint inference and construction of pairs. Since
constraint inference is an optional step, the corresponding column is often
empty, meaning that it has not been applied. Observe that for some examples
the time needed to perform the corresponding step of the analysis was
too small to be measured exactly. These cases are indicated by 0.00.

- Pr# is the number of query-mapping pairs constructed.

- A is the answer given by the system. It is either T, meaning that queries
matching the query pattern terminate, or N, meaning that there may be
queries having this pattern that do not terminate. For N we add an indication
of whether there really is non-termination, denoted by N+, or whether there is
termination that the system is not strong enough to detect, denoted by N-.

- Rem means a remark. For remarks we use the following abbreviations:

• mod means the module feature was used for the named file.

• after 4.7 transf in the mergesort example means that the transformation
outlined at the end of Subsection 4.7 was used.

• mem means that there are memory problems when trying to handle the
program.

• big means that we used the version for big programs. If the big version
has been used, measurements in the Constr column present time spent
on subquery generation and constraint inference.

Table 1. Examples from [5,30]

Program      F   R   Query            Inst   Constr  Prs    Pr#  A    Rem

Examples of [5]
append       1   1   app(b,f,f)       0.00           0.01   1    T
                     app(f,f,b)                      0.01   1    T
curry        1   4   type(b,b,f)      0.03           0.36   15   T
                     (general norm: list-size for lists, term-size otherwise)
dc_schema    0   2   dcsolve(b,f)     0.01           0.03   1    T    using dc_mod
fold         2   1   fold(b,b,f)      0.00           0.02   1    T
gtsolve      0   1   gtsolve(b,f)     0.00           0.00   0    T    using gt_mod
list         1   1   list(b)          0.00           0.00   1    T
lte          2   2   goal             0.00           0.01   2    T
map          2   1   map(b,f)         0.00           0.01   1    T
member       1   1   member(f,b)      0.00           0.01   1    T
mergesort    5   4   mergesort(b,f)                              N-
mergesort    5   4   mergesort(b,f)                  0.90   9    T    after 4.7 transf
naive_rev    2   2   reverse(b,f)     0.00           0.01   2    T
ordered      2   1   ordered(b)       0.00           0.01   1    T
overlap      1   3   overlap(b,b)     0.00           0.01   2    T
permutation  2   2   perm(b,f)        0.01   0.11                N-
quicksort    3   4   qs(b,f)          0.01   5.23    5.06   6    T
select       1   1   select(f,b,f)    0.00           0.01   1    T
subset       2   2   subset(b,b)      0.00           0.02   2    T
                     subset(f,b)                                 N+
sum          1   1   sum(f,b,f)       0.00           0.01   1    T
                     sum(f,f,b)                      0.01   1    T

Examples of [30]
append       1   1   append(b,f,f)    0.00           0.01   1    T
                     append(f,f,b)                   0.01   1    T
                     append(f,b,f)                               N+
bool         2   4   dis(b)           0.00           0.02   5    T
                     con(b)                          0.02   5    T
duplicate    1   1   duplicate(b,f)   0.00           0.00   1    T
merge        2   2   merge(b,b,f)     0.00           0.06   3    T
permute      2   2   permute(b,f)     0.01   0.02    0.01   2    T
reverse      1   1   reverse(b,f,b)   0.00           0.01   1    T
sum          1   1   sum(b,b,f)       0.00           0.02   1    T

• no rec means that there is no recursion in the program. Termination can 
be, therefore, established trivially. 

• cannot means that it is clear our methods cannot handle the program. 
For instance, this is the case if the benchmark reads input and, hence, 
its termination depends on the presence of the end-of-file. Another case 
is programs that include assert. 

Table 2. Examples from [58]

Program      Ref     F   R   Query             Inst   Constr  Prs    Pr#  A

append       1.1     1   1   append(b,f,f)     0.00           0.01   1    T
                             append(f,f,b)                    0.01   1    T
perm         1.2     2   2   perm(b,f)         0.01   0.11                N-
perm_t               2   2   perm(b,f)         0.02   0.03    0.04   3    T
transitivity 2.3.1   2   1   p(f,b)            0.00                       N+
lelJist      3.5.6   3   2   p(f)              0.00                       N+
append3      4.0.1   1   2   append3(b,b,b,f)  0.01           0.01   1    T
                             append3(f,b,b,b)                             N+
merge        4.4.3   2   2   merge(b,b,f)      0.00           0.07   3    T
perm_a       4.4.6a  3   2   perm(b,f)         0.00           0.02   2    T
arithmetic   4.5.2   1   4   s(b,f)            0.00                       N+
loops        4.5.3a  1   1   p(b)              0.00                       N+
             4.5.3b  2   1   goal=p(X),q(X)    0.00                       N+
             4.5.3c  2   1   goal=p(X),q(X)    0.00                       N+
turing       5.2.2   1   6   turing(b,b,b,f)   0.52                       N+
quicksort    6.1.1   3   4   qsort(b,f)        0.01   5.38    5.84   6    T
mult         7.2.9   2   2   mult(b,b,f)       0.01           0.02   2    T
reach1       7.6.2a  1   3   reach(b,b,b)      0.01                       N+
reach2       7.6.2b  1   3   reach(b,b,b,b)    0.01                       N-
reach3       7.6.2c  2   4   reach(b,b,b,b)    0.01   0.09    0.23   6    T
mergesort1   8.2.1   5   4   mergesort(b,f)                               N-
mergesort1   8.2.1   5   4   mergesort(b,f)                   0.90   9    T
                             (after 4.7 transf)
mergesort2   8.2.1a  5   4   mergesort(b,f)    0.02   0.79    0.52   6    T
mergesort_t          6   7   mergesort(b,f)    0.04   0.53    0.15   9    T
minsort      8.3.1   4   6   minsort(b,f)      0.01   0.15                N-
minsort1     8.3.1a  3   6   minsort(b,f)      0.00   0.09    0.11   5    T
evenodd      8.4.1   1   2   even(b)           0.00           0.01   4    T
                             odd(b)                           0.01   4    T
parser       8.4.2   3   6   e(b,f)            0.00   0.03    0.10   14   T

• succ means transformation of integers to successor notation was applied. 

• = means that the equalities elimination transformation was used. This
transformation performs the unifications given by equalities like X =
Expression and thus reduces the number of variables. The zebra example
shows how useful it can be. In this example the number of variables in the
first clause is reduced from 25 to 15, which speeds up the analysis considerably.
However, the transformation is not safe, as the following example
shows:

p(X) :- loop(X), X=a.
loop(b) :- loop(b).

Here p(X) does not terminate, while after the transformation it does.
In the full system the user can apply this transformation at their own
risk.
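
The effect of the unsafe transformation on the two-clause example can be replayed with a small Python sketch. It is only an illustration under strong assumptions: terms are restricted to constants and variables (strings starting with an upper-case letter), goals are resolved left to right, and "does not terminate" is approximated by exhausting a step budget. The names search_finishes, ORIGINAL and TRANSFORMED are our own and not part of TermiLog.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    # Follow variable bindings in substitution s.
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    return None  # two different constants


def rename(goal, n):
    # Give clause variables fresh names for the n-th resolution step.
    return (goal[0],) + tuple(f"{t}_{n}" if is_var(t) else t for t in goal[1:])


def search_finishes(query, program, max_steps=1000):
    """True if the whole left-to-right search completes within max_steps."""
    stack = [(list(query), {})]
    steps = 0
    while stack:
        goals, s = stack.pop()
        if not goals:
            continue  # an answer was found; keep exploring alternatives
        steps += 1
        if steps > max_steps:
            return False
        goal, rest = goals[0], goals[1:]
        if goal[0] == "=":
            s2 = unify(goal[1], goal[2], s)
            if s2 is not None:
                stack.append((rest, s2))
            continue
        for head, body in reversed(program):
            if head[0] != goal[0] or len(head) != len(goal):
                continue
            s2 = s
            for x, y in zip(rename(head, steps)[1:], goal[1:]):
                s2 = unify(x, y, s2)
                if s2 is None:
                    break
            if s2 is not None:
                stack.append(([rename(g, steps) for g in body] + rest, s2))
    return True


# p(X) :- loop(X), X=a.      loop(b) :- loop(b).
ORIGINAL = [(("p", "X"), [("loop", "X"), ("=", "X", "a")]),
            (("loop", "b"), [("loop", "b")])]
# After equalities elimination: p(a) :- loop(a).
TRANSFORMED = [(("p", "a"), [("loop", "a")]),
               (("loop", "b"), [("loop", "b")])]
```

With the query p(Y), the search on ORIGINAL runs out of steps because loop(X) binds X to b and then recurses forever, whereas on TRANSFORMED the call loop(a) fails immediately and the search finishes.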

Table 3. Examples from [18]

Prog         F    R    Query                 Inst    Constr     Prs    Pr#  A    Rem

aiakl        3    10   init_vars(b,b,f,f)    0.27               24.11  215  T
arm          101  76   go(b)                 84.43                          N+
bid          24   26   bid(b,f,f,f)          0.7                0.37   7    T
boyer        63   73   tautology(b)          0.23                           N-
browse       4    25   main                  2.25                           N-
deriv        2    16   d(b,b,f)              0.03               0.23   1    T
fib_t        2    4    fib(b,f)              0.00               0.06   4    T    succ
grammar      12   4                                                         T    no rec
hanoiapp_suc 2    2    shanoi(b,b,b,b,f)     0.06               0.72   7    T    succ
mmatrix      7    8    mmultiply(b,b,f)      0.02               0.04   3    T
                       trans_m(b,f)                                         N+
money        6    8    money(f,f,f,          0.35    0.93       0.85   2    T
                                                     0.63+0.02  0.02   2    T    big
occur        3    6    occurall(b,b,f)       0.01               0.06   3    T
peephole     72   62   popt1(b,f)                                                mem
progeom      4    14   pds(b,f)              0.08    5.1                    N
qplan        63   85   qplan(b,f)            73.26                          N
qsortapp     3    4    qsort(b,f)            0.01    5.55       4.97   6    T
query        50   2                                                         T    no rec
rdtok        7    48                                                             cannot
read         15   73                                                             cannot
serialize    5    9    serialize0(b,f)       0.04    2.87       2.64   8    T
                       (general norm: term-size except ||pair(X,Y)|| = 1 + ||…||)
tak          0    3                                                              cannot
tictactoe    26   43                                                             cannot
warplan      43   55                                                             mem
zebra        14   4    zebra(f,f,f,f,f,f,f)  1440    5.36+0.01  0.02   4    T    big
                                             0.96    9.57       0.76   2    T    =
                                                     0.68+0.01  0.03   2    T    = and big
zebra.pt     2    3    houses(f)             0.00    0.02       0.03   2    T

Tests were performed on an Intel® Pentium® 4 with a 1.60GHz CPU and 260Mb of
memory, running Linux 2.4.20-pre11, using SICStus Prolog Version 3.10.0.

6 Related Work and Conclusion 

In the context of logic languages the ability to program declaratively increases
the danger of non-termination. Therefore, termination analysis has received considerable
attention in logic programming. In our work we have considered universal
termination of logic programs with respect to the left-to-right selection rule of 

Table 4. Examples from [45]

Program     Ref          F   R   Query               Inst   Constr  Prs    Pr#  A

ack         [73]         1   2   ack(b,b,f)          0.00           0.05   3    T
arit_exp    [82]         0   6   e(b)                0.01           0.05   12   T
associative              1   3   normal_form(b,f)    0.01   0.03    0.02   2    T
            (general norm: term-size except ||op(X,Y)|| = 1 + 2||X|| + ||Y||)
blocks      [54]         12  5   tower(b,b,b,f)      0.02   0.13                N+
credit      [73] 21.1/2  33  24  credit(b,f)         0.06           1.12   4    T
deep_rev                 1   3   deep(b,f)           0.00           0.07   7    T
game        [4]          0   1   win(b)              0.00           0.01   1    T
            (using game_mod)
huffman                  2   8   huffman(b,f)        0.12   0.09    0.07   2    T
                                 code(b,f,f)                        0.01   1    T
p           [21]         1   2   p                   0.01                       N+
pql                      2   2   p(b,f)              0.01   0.02    0.21   13   T
                                 q(b,f)                     0.02    0.24   14   T
                                 q(f,b)                             0.17   9    T
                                 p(f,b)                                         N+
                                 q(f,f)                                         N+
queens      [21]         4   5   queens(b,f)         0.01   0.21    0.38   4    T
sicstus1    [68]         3   4   concatenate(b,f,f)  0.00           0.01   1    T
                                 concatenate(f,f,b)                 0.01   1    T
                                 member(f,b)                        0.01   1    T
                                 reverse(b,f)                       0.03   1    T
                                 concatenate(f,b,f)                             N+
                                 member(b,f)                                    N+
                                 reverse(f,b)                                   N+
sicstus2    [68]         4   2   descendant(b,b)     0.00                       N+
sicstus3    [68]         2   7   put_assoc(b,b,b,f)  0.14           0.61   9    T
                                 get_assoc(b,b,f)                   0.26   8    T
sicstus4    [68]         0   7   d(b,b,f)            0.01           0.18   1    T
sublist     [77]         1   2   sublist(b,b)        0.01                       N+
vangelder   [79]         1   10  q(b,b)              0.00           8.92   153  T
yale_s_p    [2]          2   3   holds(b,b)          0.00           0.48   9    T
                                 holds(f,b)                         0.82   18   T
                                 holds(b,f)                                     N+

Prolog. Early works on termination made no assumptions on the selection rule, 
that is, required termination with respect to all possible selection rules [11,2,12]. 
However, this notion of termination turned out to be very restrictive — the ma- 
jority of real-world programs turn out to be non-terminating with respect to it. 
Thus, most of the authors studied termination with respect to some subset of 
selection rules. The most popular selection rule is left-to-right, as adopted by 
most of the Prolog implementations. Termination with respect to non-standard
selection rules was considered, for instance, in [3,38,57,65,69,70]. 

Roughly, the existing work on termination analysis proceeded along three 
important lines: providing necessary and sufficient conditions of termination 
[27,14], providing sufficient (but not necessary) conditions for termination that 
can be verified automatically [37,41,52] and proving decidability or undecidabil- 
ity results for special classes of programs [13,33,64]. Our work is clearly situated 
in the second group: the condition presented in Theorems 3.4 and 3.5 implies 
termination and can be verified automatically. 

While considering sufficient conditions for termination found in the litera- 
ture one can distinguish between transformational [63,6,40,55] and direct ap- 
proaches [23,27,52]. A transformational approach first transforms the logic pro- 
gram into an “equivalent” term-rewrite system (or, in some cases, into an equiv- 
alent functional program). Here, equivalence means that, at the very least, the 
termination of the term-rewrite system should imply the termination of the logic 
program, for some predefined collection of queries. The approach of Arts [6] is 
exceptional in the sense that the termination of the logic program is concluded 
from a weaker property of single-redex normalisation of the term-rewrite system. 
Direct approaches, including our work, do not include such a transformation, 
but prove the termination directly on the basis of the logic program. Unlike 
the transformational approaches they usually do not put restrictions on the pro- 
grams. Another advantage of the direct approaches is that the termination proof 
can be presented to the user in terms of the original program. In the case of the 
transformational approach the user does not necessarily understand the language 
of the transformed object. 

The direct approaches can be classified as local and global. For local ap- 
proaches termination is implied by the fact that for each loop there exists a 
decreasing function (cf. [29,22]). Global approaches require the existence of a 
function that decreases along all possible loops in the program (cf. [27]). Cor- 
rectness of the local approaches is based on Ramsey’s Theorem. Our approach 
is clearly local. This also means that TermiLog can be used not only to prove 
termination but also to provide a user with the reason why non-termination is 
suspected. 

The query-mapping pairs approach originated in the algorithm of [66]. The 
original algorithm of [66] was based on an abstraction of a logic program as a 
datalog program with relations that could be infinite (this type of abstraction 
was proposed in [60]). The problem with this type of abstraction is that it loses 
too much valuable information about the original logic program and, in partic- 
ular, one has to assume that every variable in the head of a rule also appears in 
the body. In the present approach logic programs are handled directly, so all the 
information incorporated in them can be used. There are no restrictions what- 
soever on the logic programs considered. Moreover, the termination condition 
in [66,43,44], which was formulated in terms of circular variants with positive 
forward cycle, is replaced here, with the help of Ramsey’s Theorem, by a much 
simpler condition which gives a stronger termination theorem (cf. [29]). 

As far as the power of the approach is concerned one has to remember that
termination is undecidable, so one cannot expect too much. It is rather surprising
how many programs the system can handle. An interesting fact that emerges
from the experimentation is that in most cases the use of the term-size norm
suffices and it is not necessary to use more sophisticated norms. The PSPACE
hardness result of [41] applies to TermiLog's analysis. Going over the different
parts of the system one can see that if the maximal arity of a predicate, the
maximal number of different variables in a clause and the maximal number of
subgoals in a clause are limited by a relatively small number, one can expect
our method to behave reasonably, as illustrated by the experiments.

There are cases in which a linear norm is not sufficient for proving termination.
For every linear norm a ground term is instantiated enough. In [29] an example
is given of a program such that queries of the form d(ground, free) terminate,
but this cannot be proved by any linear norm.

There are also cases where the distinction between arguments that are
instantiated enough and those that are not is not enough. We can use the
query-mapping pairs as before with the only difference that we will abstract
nodes not to just black and white ones but to a larger, though finite, set.
For instance, if we have a program

p(1) :- p(1).
p(0). p(2).

and take the term-size norm and a query p(b), the query-mapping pair algorithm
will say that there may be non-termination. However, we can use the abstractions
1, g, f, where g means any ground term that is not 1 and f means any term,
and apply the above algorithm, with the only difference being in the unification
of the abstractions (both when applying the instantiation pattern of the query
and when composing query-mapping pairs). In the present case g and 1 will not
unify, so we will be able to prove that p(g) terminates.
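
The role of the refined abstraction in this example can be sketched in Python. The sketch is deliberately minimal and is not the query-mapping pairs algorithm: it only tracks which abstract calls are reachable from a query and whether one of them can reach itself again. The encoding of the program and the names abstract_unify, successors, reach and may_loop are our own assumptions.

```python
# Abstract values: "1" = the constant 1, "g" = any ground term other than 1,
# "f" = any term.

def abstract_unify(x, y):
    """Abstract result of unifying x and y, or None if they can never unify."""
    if x == "f":
        return y
    if y == "f":
        return x
    # "1" and "g" never unify; "g" with "g" may unify (an over-approximation).
    return x if x == y else None


# p(1) :- p(1).    p(0).    p(2).
# Each clause: (abstract head argument, abstract arguments of its body calls).
PROGRAM = [("1", ["1"]), ("g", []), ("g", [])]


def successors(call):
    """Abstract calls made by clauses whose head may unify with this call."""
    return [b for head, body in PROGRAM
            if abstract_unify(call, head) is not None
            for b in body]


def reach(starts):
    seen, todo = set(), list(starts)
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(successors(c))
    return seen


def may_loop(query):
    """True if some call reachable from the query can reach itself again."""
    return any(c in reach(successors(c)) for c in reach([query]))
```

Under this encoding may_loop("g") is False, since g does not unify with the head p(1) of the only recursive clause, while may_loop("1") and may_loop("f") are True.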

Abstract. Mercury is a logic programming language that is considerably
faster than traditional Prolog implementations, but lacks support
for full unification. HAL is a new constraint logic programming language 
specifically designed to support the construction of and experimentation 
with constraint solvers, and which compiles to Mercury. In this paper 
we describe the HAL Herbrand constraint solver and show how by using 
PARMA bindings, rather than the standard WAM representation, we can 
implement a solver that is compatible with Mercury’s term representa- 
tion. This allows HAL to make use of Mercury’s more efficient procedures 
for handling ground terms, and thus achieve Mercury-like efficiency while 
supporting full unification. An important feature of HAL is its support 
for user-extensible dynamic scheduling since this facilitates the creation 
of propagation-based constraint solvers. We have therefore designed the 
HAL Herbrand constraint solver to support dynamic scheduling. We pro- 
vide experiments to illustrate the efficiency of the resulting system, and 
systematically compare the effect of different declarations such as type, 
mode and determinism on the resulting code. 



1 Introduction 

The logic programming language Mercury [11] is considerably faster than tradi- 
tional Prolog implementations for two main reasons. First, Mercury requires the 
programmer to provide type, mode and determinism declarations and informa- 
tion from these is used to generate efficient target code. Types allow a compact 
representation for terms, modes guide reordering of literals and multivariant spe- 
cialization, and determinism is used to remove the overhead of unnecessary choice 
point creation. The second main reason for Mercury’s efficiency is that variables 
can only be ground (i.e., bound to a ground term) or new (i.e., first time seen 
by the compiler and thus unbound and unaliased). Since neither aliased vari- 
ables nor partially instantiated structures are allowed, Mercury does not need 
to support full unification; only assignment, construction, deconstruction and 
equality testing for ground terms are required. Furthermore, it does not need to 
perform trailing, a technique that allows an execution to continue computation
from a previous program state by logging information about prior states during 
forward computation and using it to restore the states again during backtrack- 
ing. Trailing usually means recording the state of unbound variables right before 
they become aliased or bound. Since Mercury’s new variables have no run-time 
representation they do not need to be trailed. 
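
As a rough illustration of trailing, the following Python sketch records the previous state of a variable just before it is bound and restores it on backtracking. It is a toy model under our own assumptions (the classes Var and Trail and the use of None for "unbound" are hypothetical), not the mechanism of any particular Prolog or Mercury implementation.

```python
class Var:
    def __init__(self, name):
        self.name = name
        self.value = None  # None stands for "unbound" in this sketch


class Trail:
    def __init__(self):
        self.entries = []  # (variable, previous value), most recent last

    def mark(self):
        """Remember the current trail height, e.g. at a choice point."""
        return len(self.entries)

    def bind(self, var, value):
        """Log the old state of var, then bind it."""
        self.entries.append((var, var.value))
        var.value = value

    def undo_to(self, mark):
        """Backtrack: restore every variable bound since the mark."""
        while len(self.entries) > mark:
            var, old = self.entries.pop()
            var.value = old


trail = Trail()
x, y = Var("X"), Var("Y")
choice_point = trail.mark()
trail.bind(x, "a")
trail.bind(y, "b")
trail.undo_to(choice_point)  # both X and Y are unbound again
```

A variable with no run-time representation never appears in such a log, which is why Mercury's new variables need no trailing.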

This paper investigates whether it is possible to have Mercury-like efficiency,
yet still support true logical variables. In order to do so we describe our experiences
with HAL, a new constraint logic programming language that compiles to
Mercury so as to leverage Mercury’s sophisticated compilation techniques.
Like Mercury, HAL requires the programmer to provide type, mode and determinism
declarations. Unlike Mercury, HAL was specifically designed to support
the construction of and experimentation with constraint solvers [2].

In particular, HAL includes a built-in Herbrand constraint solver that pro- 
vides full unification (without the occurs check), thus supporting logical vari- 
ables. The Herbrand solver uses PARMA bindings [12] rather than the standard 
variable representation used in the WAM [1,14]. PARMA bindings represent
equivalence of variables by keeping all equivalent variables in a cycle, as opposed
to WAM bindings which implement a union-find style equivalence class. The use
of PARMA bindings allows the solver to use essentially the same term representation
for ground terms as does Mercury (see Section 4.4). This is important
because it allows the HAL compiler to replace calls to the Herbrand constraint
solver by calls to Mercury’s more efficient term manipulation routines whenever
ground terms are being manipulated.¹
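
The idea of keeping equivalent variables in a cycle can be sketched in Python as follows. This is only a toy model under our own assumptions (the class Var and the functions same_cycle, alias and bind are hypothetical, and None stands for "unbound"); it does not reproduce the actual term representation used by HAL.

```python
class Var:
    def __init__(self, name):
        self.name = name
        self.next = self   # an unaliased variable is a cycle of length one
        self.value = None  # None stands for "unbound" in this sketch


def same_cycle(x, y):
    v = x
    while True:
        if v is y:
            return True
        v = v.next
        if v is x:
            return False


def alias(x, y):
    """Unify two unbound variables by merging their cycles."""
    if not same_cycle(x, y):
        x.next, y.next = y.next, x.next


def bind(x, value):
    """Bind x: every variable in its cycle receives the value directly."""
    v = x
    while True:
        v.value = value
        v = v.next
        if v is x:
            break


a, b, c = Var("A"), Var("B"), Var("C")
alias(a, b)
alias(b, c)
alias(a, c)   # already equivalent, so nothing changes
bind(b, 42)   # A, B and C now all hold 42 without any dereferencing chain
```

After binding, each variable holds the value itself rather than a reference to a representative, which hints at why ground terms can keep the same representation as in Mercury.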

An important feature of HAL is its use of type classes to distinguish between 
solver and non-solver types (i.e., types with an associated solver and types with- 
out) and for the hierarchical organisation of constraint solvers. Type classes 
allow a clean separation between a constraint solver’s interface and its imple- 
mentation, thus supporting experimentation with different solvers. We detail 
how HAL’s Herbrand constraint solver fits into this hierarchy. 

Another important feature of HAL is its support for user-extensible dynamic
scheduling, which is intended to support communication between solvers and construction
of efficient propagation-based solvers. We have therefore designed the
HAL Herbrand constraint solver to support dynamic scheduling. Here we detail 
how this has been achieved with a PARMA-binding based solver. Again type 
classes allow us to distinguish between solvers that support dynamic scheduling 
and those that do not. 

The HAL programmer may specify for a particular constructor type t whether 
t requires a Herbrand constraint solver (i.e. must support full unification) and, 
if so, whether this solver should support dynamic scheduling. The HAL compiler 
will then automatically generate an appropriate instance of the Herbrand solver 
for t. By requiring that constructor types that need a solver must be specified,
HAL can simplify the representation, analysis and compilation of constructor
types that do not need a solver.

¹ Actually, as long as the term is “sufficiently” instantiated.

The results of our empirical evaluation of HAL and its Herbrand solver are
very promising since they show that HAL is capable of using information from 
type, mode and determinism declarations, as well as information about which 
types require true constraint solving and dynamic scheduling, to significantly 
reduce the overhead of Herbrand constraint solving. In particular they show 
that, with appropriate declarations, HAL is almost as fast as Mercury (the extra 
overhead is mainly due to support for trailing), yet allows true logical variables. 
And while without declarations its efficiency is about half that of SICStus Prolog, 
with declarations it is an order of magnitude faster. 

The experiments are also designed to systematically evaluate the effect of 
each kind of declaration (type, mode, determinism, need to support full unification
and dynamic scheduling) on the efficiency of HAL programs so as to
determine where this speedup is coming from. This is possible since, as HAL 
provides full unification and a “constrained” mode, all versions are legitimate 
HAL programs. Our results suggest that mode declarations have the most im- 
pact on execution speed, while determinism declarations provide only moderate 
speedup. Also, although type declarations can also provide speedup, the use of 
polymorphic types can actually lead to slowdown. The overhead of unnecessary 
support for delay is noticeable but small. 

The remainder of the chapter is organized as follows. In Section 2 we first 
introduce the HAL language by means of a simple example, and then examine 
the different declarations in some detail. Section 3 provides the general design 
of HAL’s Herbrand solvers in terms of their interface and associated predicates, 
while Section 4 details their actual implementation. Next, we examine how dy- 
namic scheduling is defined in HAL in Section 5 before detailing how we imple- 
ment dynamic scheduling for Herbrand solvers in Section 6. We give our empir- 
ical evaluation in Section 7, discuss related work in Section 8, and conclude in 
Section 9. 

2 The HAL Language 

This section provides a brief overview of the HAL language, concentrating on 
its support for Herbrand constraints; for more details see [2]. The basic HAL 
syntax follows the standard Constraint Logic Programming (CLP) syntax, with 
variables, rules and predicates defined as usual (see, e.g., [10] for an introduction 
to CLP). The module system in HAL is similar to that of Mercury. A module is 
defined in a file, it imports the modules it uses and has export annotations on 
the declarations for the objects that it wishes to be visible to those importing 
it. Selective importation is also possible. 

The core language supports integer, float, character, and string data types 
plus polymorphic constructor types (such as lists) based on these base types. 
However, this support is limited to assignment, testing for equality, and con- 
struction and deconstruction of ground terms. More sophisticated manipulation 
is available by importing (or building) a constraint solver for each of the types 
involved. 

As a simple example, the following program is a HAL version of the Towers
of Hanoi benchmark which uses difference lists to build the list of moves.

:- module hanoi.                                                     (L1)
:- import int.                                                       (L2)
:- export typedef tower -> (a ; b ; c).                              (L3)
:- export typedef pair(T) -> (T - T).                                (L4)
:- export typedef move = pair(tower).                                (L5)
:- export typedef list(T) -> ([] ; [T|list(T)]) deriving herbrand.   (L6)
:- export pred hanoi(int, list(move)).                               (L7)
:-        mode hanoi(in, no) is semidet.                             (L8)
hanoi(N,M) :- hanoi2(N,a,b,c,M-[]).                                  (L9)

:- pred hanoi2(int, tower, tower, tower, pair(list(move))).          (L10)
:- mode hanoi2(in, in, in, in, oo) is semidet.                       (L11)
hanoi2(N,A,B,C,M-Tail) :-
    (   N = 1 ->
        M = [A-C|Tail]
    ;   N > 1,
        N1 is N - 1,
        hanoi2(N1,A,C,B,M-Tail1),
        Tail1 = [A-C|Tail2],
        hanoi2(N1,B,A,C,Tail2-Tail)
    ).

The first line (L1) states that the file defines the module hanoi. Line (L2)
imports the standard library module int which provides (ground) arithmetic 
and comparison predicates for the type int. Lines (L3), (L4), (L5) and (L6) 
define constructor types used in and exported by this module. The type tower 
gives the names of the towers, pair defines a polymorphic pairing type, move 
defines a move as a pair of towers using a type equivalence, and list defines 
polymorphic lists. The type declaration for lists contains the directive deriving 
herbrand indicating to the HAL compiler to generate an instance of the Her- 
brand constraint solver for list types. 

Line (L7) declares that this module exports the predicate hanoi/2 which 
has two arguments, an int and a list of moves. This is the type declaration for 
hanoi/2. 

Line (L8) is an example of a mode of usage declaration. The predicate
hanoi/2’s first argument has mode in meaning that it will already be ground
(i.e., bound to a ground term) when called, the second argument has mode no
meaning that it will be new (i.e., never seen before) on calling and old (i.e., possibly
“constrained”) on return.² The second part of the declaration “is semidet”
is a determinism statement. It indicates that hanoi/2 either succeeds with exactly
one answer or fails. In general, predicates may have more than one mode
of usage declaration.

² We could have given the mode out which means that the list will be ground on
return, but HAL’s mode checker is not yet powerful enough to confirm this.

The rest of the file contains the rules defining hanoi/2 and declarations and 
rules for the auxiliary predicate hanoi2/5 (here the mode oo means the argument 
is possibly “constrained” on both call and return). 
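
For readers more familiar with functional or imperative code, the same recursion can be sketched in Python. This is only an analogue under our own assumptions: the difference list M-Tail is modelled by passing the already known tail of the move list as an extra argument, and the semidet determinism is modelled by returning None on failure.

```python
def hanoi2(n, a, b, c, tail):
    """Moves that take n discs from a to c via b, followed by the moves in tail."""
    if n == 1:
        return [(a, c)] + tail
    if n > 1:
        # Build the list back to front: first the moves that come last.
        tail2 = hanoi2(n - 1, b, a, c, tail)
        tail1 = [(a, c)] + tail2
        return hanoi2(n - 1, a, c, b, tail1)
    return None  # no clause applies for n < 1, so the call fails


def hanoi(n):
    return hanoi2(n, "a", "b", "c", [])
```

For example, hanoi(2) yields the three moves a to b, a to c and b to c, and hanoi(0) fails. Unlike the HAL version, which fills in the open tail by unification, this sketch copies the list when prepending a move.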

2.1 Declarations 

As we can see from the above example, HAL allows programmers to annotate
predicate definitions with type, mode and determinism declarations (modelled on
those of Mercury). Like Mercury, it also provides purity declarations and type
classes. Here we examine these issues in more detail.



Type Declarations: Type declarations detail the representation format of a 
variable or argument. Types are defined using (polymorphic) regular tree type 
statements such as those shown in (L3)-(L6). As another example, the statement 

:- typedef tree(K,I) -> (item(K,I) ; node(tree(K,I),K,tree(K,I))).

defines the type constructor tree/2 for binary keyed tree types with key type
K and item type I. The definition states that type constructor tree/2 has two
functors: item/2, which represents a leaf node and is used to store an item with
its key, and node/3, which represents an internal binary tree node and is used
to store a key (for directing the search) and the two subtrees.
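
To make the shape of this type concrete, here is a Python sketch of the same structure. It is an analogue only: the class names Item and Node mirror the two functors, while the lookup function and its convention that keys less than or equal to the node key are found in the left subtree are our own assumptions and not taken from HAL.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

K = TypeVar("K")
I = TypeVar("I")


@dataclass
class Item(Generic[K, I]):
    """Leaf node: stores an item together with its key (functor item/2)."""
    key: K
    item: I


@dataclass
class Node(Generic[K, I]):
    """Internal node: a key directing the search and two subtrees (functor node/3)."""
    left: "Tree"
    key: K
    right: "Tree"


Tree = Union[Item, Node]


def lookup(tree, key):
    """Return the item stored under key, or None if it is absent."""
    while isinstance(tree, Node):
        tree = tree.left if key <= tree.key else tree.right
    return tree.item if tree.key == key else None


example = Node(Item(1, "one"), 1, Node(Item(2, "two"), 2, Item(3, "three")))
```

Here lookup(example, 2) follows the right branch at the root and the left branch at the inner node before reaching the leaf for key 2.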

Equivalence types are also allowed. For example, the statement 

:- typedef move = pair(tower).

defines the type constructor move/0 as an equivalent name for type constructor 
pair/1 with type constructor tower/0 as argument. Note that the right-hand 
side of an equivalence type is only allowed to contain type constructors not 
functors. 

Ad-hoc overloading of predicates and functions is allowed, although the definitions
for different type signatures must appear in different modules. For example,
in the module hanoi the binary function - is overloaded and may mean
either integer subtraction or difference list pairing. Overloading is important in
CLP languages since it allows the programmer to overload the standard arithmetic
operators and relations (including equality) for different types, allowing a
natural syntax in different constraint domains.



Mode Declarations: Mode declarations specify how execution of a predicate
modifies the “instantiation state” of its arguments. A mode is associated with
each argument of a predicate and has the form Inst1 -> Inst2 where Inst1 describes
the input instantiation state of the argument and Inst2 describes the
output instantiation state. Arguments of unknown structure (i.e., those associated
with a variable type) can only have one of the base instantiation states:
new, old or ground. We say that program variable X is new if it has not been
seen by its associated constraint solver (if one exists), old if it has, and ground
if X has a known fixed value.

The base modes are mappings from one base instantiation to another: we use 
two letter codes (oo, no, og, gg, ng) based on the first letter of the 
instantiation, e.g. ng is new -> ground. The standard modes in and out are 
synonyms for gg and ng, respectively. 

For terms with known structure, such as a list of moves, more complex in- 
stantiation states (lying between old and ground) may be used to describe the 
state. An example is 

instdef bound_difflist -> bound(old - old). 

which defines an instantiation state in which the difference list pair is certainly 
constructed, but the elements in the pair may still be unbound variables. Note 
that the bound keyword may be dropped from the definition since this is HAL’s 
default. 

Fully understanding the above instantiation definition is more complex than 
it may first appear, since this requires combining the instantiation with the 
type. This is because the actual meaning of old for a program variable X 
depends on whether its constructor type t is a solver type or not. If t is a 
solver type, it indicates that X may be unbound. If it is not, X must be bound. 
This applies recursively to all types associated with the arguments of the term 
to which X is bound (if any). This allows the base instantiation old to be used 
as a shorthand for the most general instantiation state of an initialized 
(i.e., not new) program variable. 

For example, in the instantiation bound_difflist the base instantiation old 
is used for variables with type list(move) (or, equivalently, 
list(pair(tower))). Thus, it is actually a shorthand for the instantiation 

instdef old_list_of_move -> ifbound([] ; [old_move | old_list_of_move]). 

instdef old_move -> bound(old_tower - old_tower). 

instdef old_tower -> bound(a ; b ; c). 

which indicates that a variable with instantiation old_list_of_move may be 
unbound (since it is enclosed by the ifbound keyword), but, if bound, it is 
either bound to an empty list or to a list with a bound move in the head, and a 
tail with the same instantiation state. Note that old means bound for the pair 
and tower constructor types since they are not solver types. (The ifbound form 
of instantiation definition is not available to the programmer, and is only 
generated internally by translation from old. This is because arbitrary 
ifbound instantiations are not checkable without sophisticated sharing 
analysis.) 

It is important to note that HAL does not allow nesting of the base 
instantiation new within a structure, i.e., all arguments in the structure must 
already be either ground or old. As we will see later, this ensures that all 
subparts of a data structure properly exist on the heap. 

Instantiation declarations can be parametric in their instantiation variables. 
For example, the instantiation definition 

instdef bound_list(I) -> bound([] ; [I | bound_list(I)]). 

defines lists whose skeleton is fixed, and whose elements have instantiation I. 

As we have seen, instantiations in HAL can be quite powerful. However, 
defining such instantiations can also be laborious, especially since they are 
often type specific. Fortunately, the ability to use old as a shorthand for the 
most general instantiation state of any type, as illustrated above, means the 
user rarely needs to define such instantiations. 
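
The way old abbreviates a type-dependent instantiation can be sketched in Python as follows. This is only an illustrative model, not the HAL compiler's algorithm: the TYPES table, the monomorphic type name list_of_move and the textual output format are assumptions introduced here. The sketch wraps the alternatives of a solver type in ifbound and those of a non-solver type in bound, recursing through argument types.

```python
# Hypothetical type table: name -> (is_solver_type, [(functor, [arg types])]).
TYPES = {
    "tower": (False, [("a", []), ("b", []), ("c", [])]),
    "move": (False, [("-", ["tower", "tower"])]),
    "list_of_move": (True, [("[]", []), ("[|]", ["move", "list_of_move"])]),
}


def expand_old(type_name, types=TYPES):
    """Return {inst_name: definition} spelling out `old` for type_name."""
    defs = {}

    def visit(name):
        inst = "old_" + name
        if inst in defs:           # already expanded (or being expanded)
            return inst
        defs[inst] = None          # placeholder stops recursion on recursive types
        is_solver, functors = types[name]
        alternatives = []
        for functor, args in functors:
            arg_insts = [visit(arg) for arg in args]
            if args:
                alternatives.append(f"{functor}({', '.join(arg_insts)})")
            else:
                alternatives.append(functor)
        # Solver types may be unbound; other types must be bound.
        wrapper = "ifbound" if is_solver else "bound"
        defs[inst] = f"{wrapper}({' ; '.join(alternatives)})"
        return inst

    visit(type_name)
    return defs
```

Calling expand_old("list_of_move") yields three definitions that correspond, up to the prefix notation used for the list and pair functors, to the three instdef statements shown earlier.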

Finally, modes can be defined using statements of the form Inst1 -> Inst2 
where, as indicated before, Inst1 describes the input instantiation state and 
Inst2 describes the output instantiation state. Equivalence modes are also 
allowed. Examples are 

:- modedef in(I) -> (I -> I). 
:- modedef in = in(ground). 
:- modedef out(I) -> (new -> I). 
:- modedef out = out(ground). 
:- modedef new2old_list_of_move = out(old_list_of_move). 

Note that mode definitions can be parametric, i.e., contain instantiation 
variables such as I above. This is, however, not the case for predicate mode 
declarations, which cannot contain variables. For more details about modes and 
instantiations in HAL the reader is referred to [4]. 
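
As a small illustration of how the parametric mode definitions relate to the two letter base modes, the Python sketch below represents a mode as an (input, output) pair of instantiation names. The function names base_mode, in_ and out are hypothetical helpers for this sketch and are not part of HAL.

```python
BASE = {"n": "new", "o": "old", "g": "ground"}


def base_mode(code):
    """Expand a two letter base mode code such as 'ng' into (input, output)."""
    before, after = code
    return (BASE[before], BASE[after])


def in_(inst):
    """Parametric mode in(I) -> (I -> I)."""
    return (inst, inst)


def out(inst):
    """Parametric mode out(I) -> (new -> I)."""
    return ("new", inst)


IN = in_("ground")                               # modedef in = in(ground)
OUT = out("ground")                              # modedef out = out(ground)
NEW2OLD_LIST_OF_MOVE = out("old_list_of_move")   # last example above
```

Under this reading, in and out coincide with the base modes gg and ng, as stated earlier in the text.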



Determinism Declarations: Determinism declarations detail how many an- 
swers a predicate may have. HAL uses the Mercury hierarchy: nondet means 
any number of solutions; multi at least one solution; semidet at most one solu- 
tion; det exactly one solution. The determinism erroneous indicates a run-time 
error, while failure indicates the predicate always fails. 
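
The hierarchy can be read as bounds on the number of answers. The following Python sketch encodes that reading; the table SOLUTION_BOUNDS and the function allows are illustrative names chosen here, and treating erroneous as admitting no normal outcome is an interpretation of "indicates a run-time error".

```python
# (minimum, maximum) number of solutions; None means "no upper bound".
SOLUTION_BOUNDS = {
    "nondet": (0, None),    # any number of solutions
    "multi": (1, None),     # at least one solution
    "semidet": (0, 1),      # at most one solution
    "det": (1, 1),          # exactly one solution
    "failure": (0, 0),      # always fails
}


def allows(determinism, solutions):
    """Is producing `solutions` answers consistent with the declaration?"""
    if determinism == "erroneous":
        return False        # a run-time error: no normal outcome is expected
    low, high = SOLUTION_BOUNDS[determinism]
    return solutions >= low and (high is None or solutions <= high)
```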



Type Class Declarations: HAL also provides type class and class instance 
declarations based on those of Mercury [7]. Type classes support constrained 
polymorphism by allowing the programmer to write code that relies on para- 
metric types having certain associated predicates and functions. In particular, a 
class provides a name for a set of types (which are parameters to the type class) 
for which certain predicates and/or functions (called the methods) are defined, 
and which form its interface. 

For example, one of the most important built-in type classes in HAL is 

class eq(T) where [ 
    pred T = T, 
    mode oo = oo is semidet ]. 

which defines types T that support equality testing, i.e., for which an 
implementation of the method =/2 for mode of usage oo = oo exists. Note however 
that, like Mercury, all types in HAL have an associated “equality” for modes 
in = out and out = in, which correspond to assignment, construction or 
deconstruction, and which are implemented using specialised built-in procedures 
rather than implementations of the more general =/2 method. 

Instances of the eq/1 class can be specified, for example, by the declaration 

instance eq(pair(T)) <= eq(T) where [ 
    pred(=/2) is pair_1_SolveEqual ]. 

which declares the type pair(T) to be an instance of the eq/1 type class, as 
long as T is also an instance of the class, and as long as there exists a 
predicate called pair_1_SolveEqual which appropriately implements the =/2 
method for type pair(T). Most types support testing for equality, the main 
exception being types with higher-order subtypes. Therefore, HAL automatically 
generates instances of eq/1 (including the predicates implementing the =/2 
method) for all constructor types (such as pair/1) which do not contain 
higher-order subtypes and for which the programmer has not already declared an 
instance, thus removing this burden from the programmer. 

One major motivation for providing type classes in HAL is that they provide a 
natural way of specifying a constraint solver’s interface and allow us to naturally 
capture the notion of a type having an associated constraint solver: It is a type 
for which there is a method for initialising variables and a method for defining 
true equality. Thus, the built-in solver/1 type class is defined by: 

class solver(T) <= eq(T) where [ 
    pred init(T), 
    mode init(no) is det ]. 

The above declaration indicates that the solver/1 type class provides the 
initialisation method init/1. The class definition also indicates that solver/1 
is a subclass of eq/1 and, thus, any instance of solver/1 must also be an 
instance of eq/1. Therefore, for type T to be in the solver/1 type class, there 
must exist predicates implementing the methods init/1 and =/2 for this type 
with mode and determinism as shown. The HAL compiler automatically inserts 
calls to init/1 to initialize new variables and may generate calls to =/2 
because of normalization. 

Purity Declarations: Purity declarations [3] capture whether a predicate is 
impure (affects or is affected by the computation state), or pure (otherwise). 
By default predicates are pure. Any predicate that uses an impure predicate 
must have its predicate declaration annotated as either impure (so that it is 
also impure) or trust pure (so that even though it uses impure predicates 
it is considered pure). Calls to pure predicates can be reordered by the HAL 
compiler during mode analysis but predicate calls are never reordered past an 
impure predicate call. 

Combined Declarations: For predicates with only one mode, HAL, like Mercury, 
provides syntax for combining all declarations into a single line. For example, 
lines (L7) and (L8) in the hanoi example can be expressed as 

export pred hanoi(int::in, list(move)::no) is semidet. 

We will often use this compact form in the sequel. 

3 Herbrand Constraint Solvers 

Term manipulation is at the core of any logic programming language. As indi- 
cated previously, the HAL base language only provides limited operations for 
dealing with terms, corresponding to those supported by Mercury. If the pro- 
grammer wishes to make use of more complex constraint solving for terms of 
some type t, then they must explicitly declare that they wish to use a Herbrand 
constraint solver for t. 

This is achieved by adding the annotation deriving herbrand to the type 
definition. The HAL compiler will then automatically generate a Herbrand con- 
straint solver for that constructor type. In order to do this, the compiler makes 
use of the following predicates and type classes defined in the system module: 

export pred herbrand_init(T::no) is det. 

class herbrand(T) <= solver(T) where []. 

export impure pred var(T::oo) <= herbrand(T) is semidet. 
export impure pred nonvar(T::oo) <= herbrand(T) is semidet. 
export impure pred ===(T::oo, T::oo) <= herbrand(T) is semidet. 

The first predicate implements the init/1 method for any Herbrand type 
declared as an instance of the solver/1 class. The herbrand/1 type class will 
be used to identify the set of Herbrand types, i.e., the constructor types 
which support full unification (since every instance of herbrand(T) must also 
be an instance of solver(T)), and a number of non-logical operations commonly 
used in Prolog style programming such as var/1, nonvar/1, and ===/2. The last 
three predicates implement such non-logical operations for any Herbrand type. 
Predicates nonvar/1 and var/1 can be used to test if a Herbrand variable is 
bound or not, respectively. Predicate ===/2 succeeds only if both arguments are 
identical unbound Herbrand variables. (===/2 is analogous to Prolog ==/2 but 
only succeeds if both arguments are unbound variables. Determining if two 
non-variable arguments are identical in HAL would require recursively 
traversing and comparing the sub-terms in the arguments. Hence, every subtype 
of the term would require the ability to test equivalence. Simply testing if 
two variables are identical only depends on the topmost type constructor.) Note 
that we could have included these predicates as methods in the herbrand/1 class 
instead of simply adding the class constraint herbrand(T) to their predicate 
type declaration. However, since the implementation of such methods will be 
identical for all types in the class, that would only complicate matters. 

As mentioned before, the HAL compiler automatically generates a Herbrand 
constraint solver for any constructor type annotated with deriving herbrand. 
In doing this the compiler generates appropriate instances for the herbrand/1, 
solver/1 and eq/1 classes. For example, in the hanoi module, since the types 
(move, tower and pair) are only manipulated when bound and, therefore, do 
not require the full power of unification, these types were not annotated with 
deriving herbrand. On the other hand, since the program uses difference lists, 
a Herbrand constraint solver is needed for the list type. Hence, the list type 
is defined as 

typedef list(T) -> ([] ; [T | list(T)]) deriving herbrand. 

The HAL compiler will then automatically generate the following declarations: 

instance eq(list(T)) <= eq(T) where [ 
    pred(=/2) is list_1_SolveEqual ]. 

instance solver(list(T)) <= eq(T) where [ 
    pred(init/1) is system:herbrand_init ]. 

instance herbrand(list(T)) <= eq(T). 

plus the definition of the predicate list_1_SolveEqual which implements 
unification specialised for the list data type as the general =/2 method for 
lists. Exactly how this is done will be discussed in detail in the following 
section. Note that herbrand_init/1, implementing the init/1 method, is already 
defined in the system module. 

The reader might be wondering why there is a need for the programmer to 
distinguish types for which Herbrand solving is supported from those for which 
it is not, since one could have simply defined all constructor types as Herbrand 
types, provided full unification for them, and then relied on the compiler to 
replace calls to the Herbrand solver by more efficient calls to the term 
assignment, construction, etc. procedures provided by Mercury. The main reason 
to separate 
the types is one of efficiency. The problem is that the compiler is not always 
capable of detecting whether a more efficient procedure can be used since to 
do so requires examining reordering of literals. Another reason is that a slightly 
more compact representation can be used for non-Herbrand terms since there is 
no need to have a tag for the case where the term is a variable. Separating the 
types means that these overheads will always be avoided in the case of the far 
more common non-Herbrand types. 

The above decision improves efficiency at the cost of code duplication. For 
example, since the type of lists with associated Herbrand solving support is 
different from that of lists without support, HAL needs to provide two library 
modules, one for each type. Furthermore, terms of one type cannot be unified 
with those of the other type. 

4 Implementing Herbrand Constraint Solving 

In this section we describe how Herbrand constraint solvers are implemented 
in HAL. We start by briefly introducing the WAM and Mercury approaches to 
term representation and manipulation, as well as describing the PARMA binding 
scheme of Taylor. Then we show how the PARMA binding scheme is used to 
implement Herbrand constraint solvers in HAL. 



4.1 Term Representation and Manipulation in the WAM 

The Warren Abstract Machine (WAM) [14,1] forms the basis of most modern 
Prolog implementations. Terms are stored on a heap, which is an array of data 
cells (for simplicity, we ignore stack variables). A cell is usually broken 
into two parts: a tag and a reference pointer. The most important tag values 
are REF (a variable reference), ATM (an atomic object, i.e., a non-variable 
term with arity 0), and STR (a structure, i.e., a non-variable term with one or 
more arguments). An unbound variable (on the heap) is represented by a cell 
with a REF tag and a pointer to itself. An atom is represented by a cell with 
tag ATM and a pointer into the atom table. The structure $f(t_1, \ldots, t_n)$ 
is represented by a STR tagged pointer to a contiguous sequence of $n + 1$ 
cells. The first cell contains the functor $f$ and the arity $n$, and the next 
$n$ cells hold the representations of $t_1, \ldots, t_n$. For example, a 
possible heap representation of the term f(h(X), Y, a, Z) is shown in Figure 1. 

Fig. 1. WAM heap representation of f(h(X), Y, a, Z). 

The native representation of base types such as integers and floats (usually) 
uses the entire cell. WAM implementations either treat them as atoms, wrap 
them in a special functor, or assign tag values for the types and use the 
remaining bits to store the data. 

Unification of two objects on the heap proceeds as follows. First, both objects 
are dereferenced. That is, their reference chain is followed until either a 
non-REF tag or a self reference is found. If at least one of the dereferenced 
objects is a self reference (i.e. an unbound variable) that object is modified 
to point to the other object. Otherwise, the tags of the dereferenced objects 
are checked for equality. In the case of an ATM tag, they are checked to see 
they have the same atom table entry. In the case of a STR tag, the functor and 
arity are checked for equality, and, if they are equal, the corresponding 
arguments are unified. 

For example, consider the heap state of Figure 1. If we first unify Y with 
the heap variable Z and then with another heap variable V, we obtain the heap 
shown in Figure 2(a). If we then unify Y with h(X) we obtain the heap shown 
in Figure 2(b). Notice how reference chains can exist throughout the heap. 

(a) WAM representation (b) After processing Y = h(X) 

Fig. 2. WAM term and variable binding schemes 

The address of any pointer variable modified by unification is (conditionally) 
placed in the trail. Since the modified variable is always a self reference, its 
previous state can be restored from this information alone. 
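
The WAM scheme described in this section can be modelled by a short Python sketch. It is a simplification under stated assumptions, not a WAM implementation: cells are (tag, value) tuples, the functor cell gets its own hypothetical FUN tag, atoms store their name directly instead of an atom table index, and every binding is trailed unconditionally. The class and method names are chosen here for illustration.

```python
REF, ATM, STR, FUN = "REF", "ATM", "STR", "FUN"


class WamHeap:
    def __init__(self):
        self.cells = []    # heap: list of (tag, value) cells
        self.trail = []    # addresses of variables bound by unification

    def _push(self, tag, value):
        self.cells.append((tag, value))
        return len(self.cells) - 1

    def new_var(self):
        # An unbound variable is a REF cell pointing to itself.
        return self._push(REF, len(self.cells))

    def new_atom(self, name):
        return self._push(ATM, name)

    def new_struct(self, functor, args):
        # Functor cell followed by one cell per argument, then the STR cell.
        base = self._push(FUN, (functor, len(args)))
        for arg in args:
            self._push(REF, arg)
        return self._push(STR, base)

    def deref(self, addr):
        # Follow the chain until a non-REF tag or a self reference is found.
        while True:
            tag, value = self.cells[addr]
            if tag != REF or value == addr:
                return addr
            addr = value

    def bind(self, var_addr, target):
        self.trail.append(var_addr)          # one word suffices: see undo_to
        self.cells[var_addr] = (REF, target)

    def unify(self, a, b):
        a, b = self.deref(a), self.deref(b)
        if a == b:
            return True
        (tag_a, val_a), (tag_b, val_b) = self.cells[a], self.cells[b]
        if tag_a == REF:
            self.bind(a, b)
            return True
        if tag_b == REF:
            self.bind(b, a)
            return True
        if tag_a != tag_b:
            return False
        if tag_a == ATM:
            return val_a == val_b
        if self.cells[val_a] != self.cells[val_b]:   # functor and arity differ
            return False
        arity = self.cells[val_a][1][1]
        return all(self.unify(val_a + 1 + i, val_b + 1 + i)
                   for i in range(arity))

    def undo_to(self, mark):
        # A trailed variable was always a self reference before binding.
        while len(self.trail) > mark:
            addr = self.trail.pop()
            self.cells[addr] = (REF, addr)
```

Unifying Y with Z, then with V, then with h(X) in this model leaves a chain Y -> Z -> V -> h(X), which is the situation that makes dereferencing necessary.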

4.2 Term Representation and Manipulation in PARMA 

In the PARMA system [12], Taylor introduced a new technique for handling 
variables that avoided the need for dereferencing (potentially long) chains when 
checking whether an object is bound or not. A non-aliased non-bound (i.e. free) 
variable on the heap is still represented as a self-reference as in the WAM. The 
difference occurs when two free variables are unified. Rather than pointing one 
at the other, as in the WAM, a cycle of bindings is created. In general n variables 
which are aliased are represented by n cells forming a cycle. When one of the 
variables is equated to a non-variable all variables in the cycle are changed to 
direct (tagged) pointers to this structure and changes are trailed. 



(a) PARMA representation (b) After processing Y = h(X) 

Fig. 3. PARMA term and variable binding schemes 



For example, the PARMA heap structures corresponding to Figures 2(a) and 
(b) are shown in Figures 3(a) and (b), respectively. 

The PARMA scheme for variable representation has the advantage that deref- 
erencing of bound terms on the heap is never required. However, it has three 
potential disadvantages: 

(a) Checking if two unbound variables are equivalent is more involved, and is 
required for variable-variable binding. Essentially, each variable’s cycle of 
aliased variables may need to be traversed. Furthermore, trailing of each 
variable requires two words (the variable’s position and its old value). 

(b) When instantiating a variable cycle (conditional) trailing must occur for each 
cell in the cycle (rather than one as for the WAM). Also, as before, the trail 
requires two words. 

(c) When creating a structure that will hold a copy of an already existing un- 
bound variable, the cycle of variables grows, and trailing potentially occurs. 

However, the impact of each of these factors is dependent on the length 
of the cycles that are manipulated. Since, as we shall see, cycles rarely grow 
beyond length one (a self pointer), the overhead involved is limited, although 
not completely eliminated (particularly in the case of trailing overhead). 

It is important to note that only heap variables can be placed in a variable’s 
alias cycle. An unbound initialized variable on the stack or in a register points 
into a cycle on the heap. If this cycle is then bound, the stack or register variable 
becomes a pointer to a bound object. This means that when accessing data 
through a stack variable or register, the PARMA scheme sometimes requires a 
single step dereference. 
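
A minimal Python model of the PARMA representation is sketched below, assuming a heap of (tag, value) cells in which a bound cell simply stores its value under a hypothetical VAL tag. The alias method here joins two cycles by swapping successors and assumes the two variables are not already aliased; the names ParmaHeap, alias, bind and undo_to are illustrative and do not come from the PARMA system.

```python
class ParmaHeap:
    def __init__(self):
        self.cells = []    # ("REF", successor address) or ("VAL", value)
        self.trail = []    # (address, old cell): two words per entry

    def new_var(self):
        addr = len(self.cells)
        self.cells.append(("REF", addr))    # free variable: self reference
        return addr

    def cycle(self, addr):
        """Addresses of all variables aliased with addr, starting at addr."""
        members, current = [addr], self.cells[addr][1]
        while current != addr:
            members.append(current)
            current = self.cells[current][1]
        return members

    def alias(self, x, y):
        """Join two distinct cycles by swapping the successors of x and y."""
        for addr in (x, y):
            self.trail.append((addr, self.cells[addr]))
        self.cells[x], self.cells[y] = self.cells[y], self.cells[x]

    def bind(self, x, value):
        """Point every cell in x's cycle directly at the value."""
        for addr in self.cycle(x):
            self.trail.append((addr, self.cells[addr]))
            self.cells[addr] = ("VAL", value)

    def undo_to(self, mark):
        """Restore trailed cells, most recent first."""
        while len(self.trail) > mark:
            addr, old = self.trail.pop()
            self.cells[addr] = old
```

The model makes two of the costs listed above visible: binding a cycle of three variables adds three trail entries rather than one, and each entry must record the old cell contents because the overwritten cell is no longer guaranteed to have been a self reference.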

4.3 Term Representation and Manipulation in Mercury 

Types in HAL with no solver attached are identical to Mercury types. In this 
section we explain Mercury’s approach to type representation and manipulation. 

Recall that variables in Mercury can only be either new (which means they 
do not have a representation) or ground. Thus, there is no need for the REF 
tagged references used in the WAM. This combined with the fact that types are 
always known at compile time, allows Mercury to use a compact type-specific 
representation for terms in which tags are used instead to distinguish among the 
different type functors defined for the type. Hence, an object of a base type, like 
an integer, is free to use its entire cell to store its value. For more details see [11]. 
As an example, consider the Mercury type for lists (for uniformity we use HAL 
syntax rather than that of Mercury): 

typedef list(T) -> ([] ; [T | list(T)]). 

Given a term of type list(T) there are only two possibilities for its 
(top-level) value: it is either nil “[]” or cons “[|]”. Mercury reserves one 
tag value (NIL) for nil, and one (CONS) for cons. Since the nil reference does 
not need any further information the pointer part is 0. A cons structure is 
simply two contiguous cells: the first is a representation of the first element 
(e.g. a tagged pointer or a 32 bit int) and the second is a reference to the 
rest of the list. 

Assuming 32 bit words and aligned addressing, the low two bits of a pointer 
are zero. In Mercury these bits are used for storing the tag values, hence four 
different tags are available. For types with more than four functors, the 
representation is modified. Since for a constant functor (such as NIL) the 
remaining part of the cell is unused, the remaining 30 bits can be used to 
store different constant functors. For types with more non-constant functors 
than remaining tags, the Mercury representation uses an extra cell to store the 
identity of the extra functors, much like the WAM representation (although the 
arity of the functor does not need to be stored since the type information 
gives this). In what follows, we will ignore this for simplicity. 
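
The bit-level layout described above can be illustrated with a few lines of Python. This is a sketch under assumptions: the concrete tag values chosen for NIL and CONS and the helper names are hypothetical, and Mercury's actual tag assignment is not specified in the text.

```python
TAG_BITS = 2
TAG_MASK = (1 << TAG_BITS) - 1    # 0b11, so four tag values are available
NIL, CONS = 0, 1                  # assumed tag assignment for list(T)


def tag_pointer(address, tag):
    """Combine a 4-byte aligned 32 bit address with a tag in its low bits."""
    if address % 4 != 0 or not 0 <= address < 2**32:
        raise ValueError("address must be a 4-byte aligned 32 bit value")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in two bits")
    return address | tag


def tag_of(word):
    return word & TAG_MASK


def pointer_of(word):
    return word & ~TAG_MASK


def encode_constant(index, tag=0):
    """Store the index of a constant functor in the remaining 30 bits."""
    if not 0 <= index < 2**30:
        raise ValueError("constant index does not fit in 30 bits")
    return (index << TAG_BITS) | tag


def constant_index(word):
    return word >> TAG_BITS
```

For instance, a cons cell allocated at address 0x1000 is referenced by the word 0x1001 under this assignment, while a type with many constant functors can share a single tag value among all of them.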

Mercury performs program normalization, so that only two forms of equations 
are directly supported: X = Y and $X = f(A_1, \ldots, A_n)$ for each functor 
$f$, where $A_1, \ldots, A_n$ are distinct variables. 

As mentioned before, equations of the form X = Y are only valid in three 
modes: in = out, out = in, and in = in. For the first two modes, the ground 
variable is copied into the new. For the third mode a procedure to check that 
the two terms are identical is called. Mercury automatically generates a 
specialized procedure (which we shall refer to as unify_gg) that does this for 
each type. 

The equation $X = f(A_1, \ldots, A_n)$ is only valid in two modes: out = in 
(i.e., X is new and $A_1, \ldots, A_n$ are all ground) and in = out (i.e., X is 
ground and each of $A_1, \ldots, A_n$ is new). In the first case a contiguous 
block of $n$ cells is allocated, the values of $A_1, \ldots, A_n$ are copied 
into these cells, and X is set to a pointer to this block with an appropriate 
tag. In the second case, after testing that X is bound to the appropriate type 
functor, the values in the contiguous block of $n$ cells that it points to are 
copied into $A_1, \ldots, A_n$. The case where some of $A_1, \ldots, A_n$ are 
new and some ground (e.g. $A_i$) is handled by replacing each such variable in 
the equation by a new variable (e.g. $A_i'$) and a following equation (e.g. 
$A_i' = A_i$). 

As an example, consider how Mercury will (attempt to) compile the equation 
T = f(h(1), Y, a, Y) where Y and T are new. First, it is normalized to give the 
equations X = 1, U = h(X), S = a, Z = Y, T = f(U, Y, S, Z). The first three 
equations can be compiled to “construct” variables X, U and S, respectively. 
The two remaining equations cannot be compiled since they do not satisfy one 
of the above modes. If later in the goal Y is given a ground value by literal 
l, then these two equations can be reordered after l and compiled to construct 
Z and T. 
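
The normalization step in this example can be sketched in Python. The term encoding is an assumption made for illustration: variables are strings starting with an upper-case letter, constants are other scalars, and compound terms are (functor, [arguments]) pairs. Fresh variables are named V1, V2, ... here, so the result matches the equations above only up to renaming, and the sketch does not check that fresh names avoid variables already present in the term.

```python
import itertools


def normalize(lhs, term):
    """Flatten `lhs = term` into equations X = Y and X = f(A1, ..., An)."""
    fresh = (f"V{i}" for i in itertools.count(1))
    equations = []

    def is_var(t):
        return isinstance(t, str) and t[:1].isupper()

    def flatten(target, t):
        if is_var(t) or not isinstance(t, tuple):
            # Variable or constant on the right-hand side.
            equations.append(f"{target} = {t}")
            return
        functor, args = t
        seen, names = set(), []
        for arg in args:
            if is_var(arg) and arg not in seen:
                seen.add(arg)
                names.append(arg)
            else:
                # Non-variable or repeated variable: introduce a fresh
                # variable and emit its defining equation first.
                name = next(fresh)
                flatten(name, arg)
                names.append(name)
        equations.append(f"{target} = {functor}({', '.join(names)})")

    flatten(lhs, term)
    return equations
```

Applied to T = f(h(1), Y, a, Y), the function returns five equations in the same order as in the text, with V2, V1, V3 and V4 playing the roles of X, U, S and Z.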

4.4 Term Representation and Manipulation in HAL 

Since HAL is compiled into Mercury, it makes considerable sense for HAL to 
use as far as possible Mercury’s basic term manipulation functions even for 
types that sometimes require full unification. The idea is that, when possible, 
term equations should be compiled into Mercury’s basic term manipulations (as- 
signment, construction, deconstruction, and equality testing) rather than calling 
the more expensive unification solving method. However for this to be possible, 
terms in HAL must use a term representation which is compatible with that of 
Mercury. 

HAL employs the PARMA approach to variable binding with the Mercury 
term representation scheme. The main reason for using the PARMA approach, 
rather than that of the WAM, is that when a term structure becomes ground in 
the PARMA scheme it has no reference chains within it. Hence, once it is ground 
it becomes a legitimate Mercury term. Furthermore, even when a term is only 
partially bound, the HAL compiler can (mis)use the efficient Mercury operations 
to manipulate the bound part of the term, since they will still give the desired 
behaviour. In order to do this, HAL reserves the tag 0 in all Herbrand solver 
types for use as the REF tag. This means that instead of the four tags generally 
available for representing a type in Mercury there are only three available for a 
solver type. 

For example, given the type declarations: 

typedef erk -> (f(erk, erk, atm, erk) ; h(erk) ; g) deriving herbrand. 

typedef atm -> (a ; b ; c ; d ; e) . 

the HAL representation of the term T = f(h(X), Y, a, Z) is shown in Figure 4. 

Fig. 4. HAL heap representation of f(h(X), Y, a, Z). 



Dereferencing: As in the PARMA system, only heap variables can be placed 
in a variable’s alias cycle. Thus, a stack variable or a register must be a pointer 
somewhere into the cycle. As a result, when accessing data through a stack 
variable or register, HAL sometimes requires a single step dereference. Consider 
the following goal, where all variables are initially new: 

init(Z), X = Z, X = [a], X = [A|B], 

Figure 5 illustrates the changes to the heap and the registers holding X and Z 
during the execution of the first 3 atoms in the goal. Note that (due to the 
way Mercury handles registers) X and Z remain as pointers to the instantiated 
list rather than being updated to its value (what it points at on the heap). 
Before the execution of the atom X = [A|B] we must perform a one step 
dereference so that we can handle the equation simply as a Mercury deconstruct. 

Fig. 5. Register and heap representation for each stage of init(Z), X=Z, 
X=[a]. 

HAL produces Mercury code that maintains the assumption that: 

- an old Herbrand object may need to be dereferenced. 
- a bound Herbrand object is already dereferenced. 

To do so, explicit dereferencing instructions are added to the output Mercury 
code, that create a new dereferenced version of a variable. Such dereferencing 
instructions are only required to be added to the user’s code when the compiler 
detects that the instantiation state of a variable changes from old to some 
bound instantiation. For example, the goal above is translated to Mercury code 
of the form 

init(Z), X = Z, X = [a], X_Derefd = deref(X), X_Derefd = [A|B], 

The deref pseudo-C code simply returns the value pointed to by its argument 
if this is not a variable: 

deref(X) { 
    if (derefd_var(X) && !derefd_var(*X)) return *X; 
    return X; } 

Importantly, the code does not return the next address in a variable chain, but 
the original address. This will be required later for correctness of dynamic 
scheduling. 

The code derefd_var to check whether a pointer is a variable pointer is simply 

derefd_var(X) { return (tag(X) == REF); } 

The code var to check whether an arbitrary old term is a variable must do 
the one step dereference. It is defined as follows: 

var(X) { return (derefd_var(X) && derefd_var(*X)); } 

The code for nonvar simply uses var. 

nonvar(X) { return !var(X); } 

Unification: HAL, like Mercury, normalizes programs so that only two forms of 
equations arise: X = Y and $X = f(A_1, \ldots, A_p)$ (where each $A_i$ is a 
distinct variable). The compiler translates these equations into calls to 
appropriate Mercury and C code to implement the PARMA variable scheme as 
follows. 

Consider an equation of the form X = Y. For modes in = out, out = in, 
and in = in we simply call Mercury’s more efficient procedures. (For in = in, 
this is correct only if X and Y contain no non-Herbrand solver types. For the 
purposes of this paper we will ignore this.) If one of the variables is new and 
the other one is old, we can simply assign the old variable to the new. This is 
identical to what Mercury does for this case (with the understanding that old 
is interpreted as ground) and we can therefore again use Mercury’s procedure. 
When both X and Y are new an initialization init(Y) is added beforehand. The 
initialization allocates a new cell on the heap, makes it a self-pointer and 
returns a reference to this cell in Y. This makes Y old and the previous case 
applies. The (pseudo-C) code for init is simply 

init(X) { X = top_of_heap++; *X = X; } 
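
The pseudo-C helpers init, derefd_var, deref, var and nonvar can be exercised with a small Python model. The encoding is an assumption made for this sketch: the heap is a Python list passed explicitly, a word is a (tag, payload) tuple, a REF word carries a heap index, and init returns the new reference instead of assigning through its argument.

```python
REF = "REF"


def init(heap):
    """Allocate a self-pointing cell and return a reference to it."""
    addr = len(heap)
    heap.append((REF, addr))
    return (REF, addr)


def derefd_var(word):
    """Is this word a variable pointer?"""
    return word[0] == REF


def deref(heap, word):
    """One step dereference: return the pointed-to value unless it is a variable."""
    if derefd_var(word) and not derefd_var(heap[word[1]]):
        return heap[word[1]]
    return word          # the original address, not the next one in the chain


def var(heap, word):
    """Is an arbitrary old term an unbound variable?"""
    return derefd_var(word) and derefd_var(heap[word[1]])


def nonvar(heap, word):
    return not var(heap, word)
```

Following the goal in the text, a register holding X still contains a REF word after X = [a] has overwritten the heap cell, and a single call to deref yields the word for the list itself.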

The only remaining case, where both X and Y are old, requires true 
unification. We replace the equation with a call to the Herbrand unification 
procedure unify_oo, which is automatically generated by the HAL compiler for 
the solver type t of X and Y. A simplified version of the code for unify_oo is 
shown in Figure 6. In the actual code (which is specialised for each type 
rather than being polymorphic) the calls to nonvar and deref are folded into 
one call. 



pred unify_oo(T,T) <= herbrand(T). 
mode unify_oo(oo,oo) is semidet. 
unify_oo(X,Y) :- 
    (nonvar(X) -> 
        (nonvar(Y) -> 
            unify_val_val(deref(X), deref(Y)) 
        ;   unify_var_val(Y, deref(X))) 
    ;   (nonvar(Y) -> 
            unify_var_val(X, deref(Y)) 
        ;   unify_var_var(X, Y))). 

Fig. 6. HAL code for equating two old objects of type T. 



The procedure unify_val_val is similar to Mercury’s procedure unify_gg 
except it calls unify_oo on arguments of unified terms rather than unify_gg. 
It assumes that its arguments are dereferenced. For example, unify_val_val 
and unify_gg for list types are shown in Figure 7. In practice the final calls 
to unify_oo and unify_gg would be specialized since we know they apply to list 
arguments (and thus we know the name of the predicate which implements the 
method). 

The procedure unify_var_val in Figure 8 unifies a variable and a non- 
variable. This means modifying all the variables in the cycle to directly refer 
to the non-variable, and trailing the changes. The procedure assumes the second 
argument is dereferenced. 

The procedure unify_var_var shown in Figure 9 unifies two variables. This
means checking that the variables are not already the same, and then joining
the cycles together, trailing the change. Note that, unlike the case for the WAM,
the code for unifying two variables is symmetric, treating each variable the same
way. Also note that the algorithm traverses the two cycles in parallel, stopping
when the shortest cycle has been completed.

:- pred unify_gg(list(T),list(T)) <= eq(T).
:- mode unify_gg(in,in) is semidet.
unify_gg([],[]).
unify_gg([X|Xs],[Y|Ys]) :-
    unify_gg(X,Y),
    unify_gg(Xs,Ys).

:- instdef nonvar_list -> bound([] ; [old|old]).
:- pred unify_val_val(list(T),list(T)) <= eq(T).
:- mode unify_val_val(in(nonvar_list),in(nonvar_list)) is semidet.
unify_val_val([],[]).
unify_val_val([X|Xs],[Y|Ys]) :-
    unify_oo(X,Y),
    unify_oo(Xs,Ys).

Fig. 7. HAL code for equating two nonvariable objects of type list(T).

unify_var_val(X,Y) {
    QueryX = X;
    repeat {
        Next = *QueryX;
        trail(QueryX);        /* trail chain pointer */
        *QueryX = Y;          /* replace by value */
        QueryX = Next; }
    until (QueryX == X) }

Fig. 8. Pseudo-C code for HAL unification of a variable and value

unify_var_var(X,Y) {
    QueryX = *X;
    QueryY = *Y;
    while (QueryX != Y && QueryY != X)       /* while equality not found */
        if (QueryX != X && QueryY != Y) {    /* if loops unfinished */
            QueryX = *QueryX;                /* advance */
            QueryY = *QueryY;
        } else {                             /* else trail X and Y */
            trail(X); trail(Y);
            Tmp = *X; *X = *Y; *Y = Tmp;     /* merge chains */
            break; } }                       /* and finish */

Fig. 9. Pseudo-C code for HAL unification of two variables
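The two procedures can be tried out on a small model of the heap. The following Python sketch is only an illustration under stated assumptions: the heap is a Python list, a cell is either ("REF", address) or ("VAL", term), and the helper names new_var, set_cell and undo_to are hypothetical rather than part of HAL.

```python
# Hypothetical model: a cell is ("REF", addr) for a variable link
# or ("VAL", term) for a bound value.
heap = []
trail = []   # (address, old cell) pairs, so that changes can be undone


def new_var():
    """Allocate an unbound variable: a cell that refers to itself."""
    heap.append(("REF", len(heap)))
    return len(heap) - 1


def set_cell(addr, cell):
    """Overwrite a cell, trailing its previous contents first."""
    trail.append((addr, heap[addr]))
    heap[addr] = cell


def unify_var_var(x, y):
    """Join the cycles of x and y unless they are already the same cycle."""
    qx, qy = heap[x][1], heap[y][1]
    while qx != y and qy != x:
        if qx != x and qy != y:          # neither cycle completed: advance both
            qx, qy = heap[qx][1], heap[qy][1]
        else:                            # one cycle completed without meeting
            old_x, old_y = heap[x], heap[y]
            set_cell(x, old_y)           # swapping the two links merges the cycles
            set_cell(y, old_x)
            return


def unify_var_val(x, value):
    """Point every cell in the cycle of x directly at the value."""
    q = x
    while True:
        nxt = heap[q][1]
        set_cell(q, ("VAL", value))
        q = nxt
        if q == x:
            break


def undo_to(mark):
    """Backtrack: restore trailed cells until the trail has length mark."""
    while len(trail) > mark:
        addr, old = trail.pop()
        heap[addr] = old


a, b, c = new_var(), new_var(), new_var()
unify_var_var(a, b)
unify_var_var(b, c)
unify_var_var(a, c)                      # already aliased: nothing changes
cycle_after_alias = [heap[a], heap[b], heap[c]]
mark = len(trail)
unify_var_val(a, "t")
bound_cells = [heap[a], heap[b], heap[c]]
undo_to(mark)
restored = [heap[a], heap[b], heap[c]]
```

In this model the third call to unify_var_var leaves the heap untouched because the traversal reaches the other variable before either cycle is completed, and undoing the trail restores the three-element cycle.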

Processing an equation of the form X = f(A_1, ..., A_p) is more complicated
since we may have to create objects on the heap. First, let us consider the simple
case when X is bound, then the case when X is new, and finally the most complex
case: when X is old.

The easiest case for handling an equation of the form X = f(A_1, ..., A_p)
occurs when X is known to be bound and A_1, ..., A_p are new. This is simply
left to Mercury. If one (or more) of A_1, ..., A_p are not new, they are replaced by
new variables and equations as in the Mercury case.

The second case, when X is new, will require the construction of a new 
structure on the heap. For this to happen, and since arguments within a structure 
are not allowed to be new in HAL, each variable Ai with instantiation new must 
first be initialised. If the type of the variable is known at compile time to be a 
Herbrand type or other solver type, initialisation is not a problem. If, however, 
the type is known to be neither Herbrand nor any other solver-type, a compile- 
time error can be issued. Finally, if the type of the variable is not known at 
compile-time (i.e., it is a variable type), we must call a general initialisation 
procedure that decides what to call at run-time and can result in a run-time 
error if the type ends up not being a solver type. This would be simple if one
could check at run-time whether a variable has a type which is an instance
of a certain type class (such as herbrand/1 or solver/1). However, this is not
yet possible in Mercury. Thus, in order to support this and other type-related
queries, HAL defines the following internal type class:

:- class hal_type_info(T) where [
       pred maybe_init(T::no) is det,
       pred is_type_herbrand(T::oo) is semidet,
       pred is_type_solver(T::oo) is semidet ].

where maybe_init/1 initialises the variable in the heap if this is needed before
performing a construction, is_type_herbrand succeeds if the type is Herbrand,
and is_type_solver succeeds if the type is a non-Herbrand solver type. HAL
will also automatically create an instance of hal_type_info/1 for every user-
defined type t as follows. If t is neither Herbrand nor a solver type, the instance
is:

:- instance hal_type_info(t) where [
       pred(maybe_init) is error,
       pred(is_type_herbrand) is fail,
       pred(is_type_solver) is fail ].

where error issues a run-time error, and fail always fails. If t is not a
Herbrand type but is a solver type, the instance is:

:- instance hal_type_info(t) where [
       pred(maybe_init) is init,
       pred(is_type_herbrand) is fail,
       pred(is_type_solver) is true ].

where init is the predicate appearing in the solver(t) instance as the implementation
of method init/1, true always succeeds and fail always fails. Finally, if t is a
Herbrand type, the instance is:

:- instance hal_type_info(t) where [
       pred(maybe_init) is dummy_init,
       pred(is_type_herbrand) is true,
       pred(is_type_solver) is fail ].

where dummy_init does nothing (as we will see, Herbrand variables do not require
initialisation before a construction), and true and fail are as before.

Using the above predicates, the construction of the term X = f(A_1, ..., A_p)
can be done as follows. Let us assume that all variables have variable type,
variables A_{o_1}, ..., A_{o_m} are old while A_{n_1}, ..., A_{n_l} are new. Then, the translation
to Mercury is essentially:

maybe_init(A_{n_1}), ..., maybe_init(A_{n_l}),
X = f(A_1, ..., A_p),
(is_type_herbrand(A_{n_1}) -> A_{n_1} = init_heap(X, n_1 - 1) ; true),
...,
(is_type_herbrand(A_{n_l}) -> A_{n_l} = init_heap(X, n_l - 1) ; true),
(is_type_herbrand(A_{o_1}) -> fix_copy(X, o_1 - 1) ; true),
...,
(is_type_herbrand(A_{o_m}) -> fix_copy(X, o_m - 1) ; true)

where the method maybe_init is first used to initialise all non-Herbrand new
variables. Once this is done, the construction can be scheduled as a Mercury
construct. Then, is_type_herbrand is used to perform a run-time check to see if the
actual type of the arguments is a Herbrand type and, if so, call specialised code
to appropriately initialise the argument. This is done by the init_heap(X,i)
function, which creates a self reference in slot i of the heap region pointed
to by X and returns it. Note that indices for slots on the heap start from 0 and,
therefore, we must use init_heap(X, n_j - 1) rather than init_heap(X, n_j). The
function is defined as:

init_heap(X,i) { return X[i] = &(X[i]); }

Note that init_heap is effectively a specialized version of init/1 for the PARMA
representation of variables inside data structures.

Finally, each old Herbrand argument A_{o_j} was copied by Mercury into the
new heap structure. For cases where this simple copy may not have achieved
the desired result we need to call fix_copy(X, o_j - 1). If A_{o_j} was an unbound
variable, the copy performed by Mercury results in a reference to the cycle in the
cell X[o_j - 1] rather than the cell X[o_j - 1] being placed in the cycle. Thus, fix_copy needs
to add the cell into the cycle. If A_{o_j} is bound but not dereferenced (this can
happen for stack and register variables), fix_copy must replace the contents of
the cell by what it refers to. The procedure is defined as:

fix_copy(X,i) {
    AXi = &(X[i]); Xi = X[i];
    if (derefd_var(Xi))
        if (derefd_var(*Xi)) { trail(Xi); *AXi = *Xi; *Xi = AXi }
        else *AXi = *Xi; }

If, as is usually the case, the types are known at compile time, the generated
code can be (and is) simplified enormously. Knowing the type allows the run-time
type checks to be eliminated and the code simplified appropriately.

For example, consider the construction of T = f(U,V,S,Z) where T and Z
are new, U is known to be bound (to h(X)), S is known to be bound (to a), and
V is old (and part of a cycle). In this case we know the type of all arguments
completely. The generated code is

maybe_init(Z),           %% Noop as Z is Herbrand
T = f(U,V,S,Z),          %% Mercury construct
Z = init_heap(T,3),      %% initialize Z
fix_copy(T,1)            %% fix V

After executing the Mercury construction T = f(U,V,S,Z) the heap is as shown
in Figure 10(a). Applying init_heap(T,3) and fix_copy(T,1) gives the heap
shown in Figure 10(b).
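The effect of the two fix-up steps on T = f(U,V,S,Z) can be replayed on a toy heap. The Python sketch below is an illustration only: the flat list heap, the ("REF", addr) / ("VAL", term) / ("NEW", None) cell encoding and the helpers alloc, construct and cycle are assumptions made for the example, and the plain copy in construct merely stands in for the Mercury construct.

```python
heap = []    # hypothetical flat heap of ("REF", addr) or ("VAL", term) cells
trail = []   # (address, old cell) pairs for undoing changes


def alloc(cell=None):
    """Allocate one cell; with no argument it is a self-referencing variable."""
    addr = len(heap)
    heap.append(cell if cell is not None else ("REF", addr))
    return addr


def construct(args):
    """Stand-in for the Mercury construct: copy the argument words into new slots."""
    base = len(heap)
    heap.extend(args)
    return base


def init_heap(base, i):
    """Turn slot i into a fresh variable (a self reference) and return its address."""
    heap[base + i] = ("REF", base + i)
    return base + i


def fix_copy(base, i):
    """Repair slot i after a plain copy of an old Herbrand argument."""
    slot = base + i
    tag, target = heap[slot]
    if tag == "REF":
        if heap[target][0] == "REF":     # unbound: splice the slot into the cycle
            trail.append((target, heap[target]))
            heap[slot] = heap[target]
            heap[target] = ("REF", slot)
        else:                            # bound but not dereferenced: copy the value
            heap[slot] = heap[target]


def cycle(addr):
    """List the addresses on the variable cycle that starts at addr."""
    out, q = [addr], heap[addr][1]
    while q != addr:
        out.append(q)
        q = heap[q][1]
    return out


# V is old and part of a two-element cycle; U and S are bound; Z is new.
v1, v2 = alloc(), alloc()
heap[v1], heap[v2] = ("REF", v2), ("REF", v1)
t = construct([("VAL", "h(X)"), ("REF", v1), ("VAL", "a"), ("NEW", None)])
z = init_heap(t, 3)                      # initialise Z inside the structure
fix_copy(t, 1)                           # put the slot for V into V's cycle
v_cycle = cycle(v1)

# A bound but not dereferenced argument is replaced by its value.
w = alloc(("VAL", "a"))
t2 = construct([("REF", w)])
fix_copy(t2, 0)
```

After the fix-up the slot for V is a member of the cycle of V instead of a pointer into it, and the slot for Z is a fresh self-referencing variable.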




(a) After Mercury construct (b) Corrected version 



Fig. 10. Adapting Mercury’s term construction for Herbrand terms 



To illustrate polymorphic code, consider the literal X = [A|Y] where both X
and Y have type list(T), A has type T, X is new and both A and Y are old.
The construction code is shown below:

X = [A|Y],                     %% Mercury construct
(is_type_herbrand(A) ->        %% if A is a term solver type
    fix_copy(X,0) ; true),     %% fix A
fix_copy(X,1)                  %% fix Y

The third and final case handles the equation X = f(A_1, ..., A_p) when X is
old. The generated code checks if X is bound, in which case it treats the equation
as if it were the deconstruction X = f(B_1, ..., B_p) followed by equations A_i =
B_i. Otherwise, X is a variable and the code constructs the term f(A_1, ..., A_p)
on the heap (depending on whether the arguments are solver types or not, this may
not be possible, causing a run-time error) and then equates X to this term using
unify_var_val.

Consider again the literal X = [A|Y] where both X and Y have type list(T)
and A has type T, this time with A new and both X and Y old. The generated
code has the form

(nonvar(X) ->                      %% deconstruct
    Xd = deref(X),
    Xd = [An|Yn],                  %% Mercury deconstruct
    A = An,                        %% copy operation (A is new)
    unify_oo(Y,Yn)                 %% arbitrary unification
;                                  %% construct
    maybe_init(A),                 %% possible initialization of A
    X = [A|Y],                     %% Mercury construct
    (is_type_herbrand(A) ->        %% if A is a term solver type
        A = init_heap(X,0) ; true),    %% fix A
    fix_copy(X,1) )                %% fix Y



Again a run-time error can occur if A is a variable, since the call to maybe_init 
will raise an exception if A does not have a solver type. 



4.5 Implementation of herbrand/1 Methods 



Supporting the methods in the herbrand type class is straightforward once the
representation of terms is decided. We have already defined var/1 and nonvar/1
in Section 4.4. The ===/2 predicate only needs to check whether two variables
are in the same reference chain. This can be implemented as follows (cf. the code
for unifying two variables in Figure 9).



===(X,Y) {
    if (!var(X) || !var(Y)) return FALSE;    /* not both vars */
    QueryX = *X; QueryY = *Y;
    while (QueryX != Y && QueryY != X)       /* while equality not found */
        if (QueryX != X && QueryY != Y) {    /* if neither loop finished */
            QueryX = *QueryX;                /* advance */
            QueryY = *QueryY;
        } else
            return FALSE;                    /* not identical */
    return TRUE; }
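The identity test is easy to check on a few hand-built cycles. The Python sketch below assumes, purely for illustration, that the heap is a list of ("REF", address) and ("VAL", term) cells; the function name identical is hypothetical.

```python
def identical(heap, x, y):
    """Return True if variables x and y lie on the same reference cycle."""
    if heap[x][0] != "REF" or heap[y][0] != "REF":
        return False                      # not both variables
    qx, qy = heap[x][1], heap[y][1]
    while qx != y and qy != x:            # while equality not found
        if qx != x and qy != y:           # neither loop finished: advance both
            qx, qy = heap[qx][1], heap[qy][1]
        else:
            return False                  # one cycle completed: not identical
    return True


# Cells 0, 1 and 2 form one cycle, cell 3 is a lone variable, cell 4 is bound.
heap = [("REF", 1), ("REF", 2), ("REF", 0), ("REF", 3), ("VAL", "t")]
checks = [identical(heap, 0, 2), identical(heap, 0, 3),
          identical(heap, 0, 4), identical(heap, 3, 3)]
```

As in the pseudo-C version, the walk stops as soon as the shorter of the two cycles has been traversed.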



5 Dynamic Scheduling 

Most modern logic programming languages allow predicates or goals to delay
until a particular condition (such as becoming bound or being unified with another
variable) is satisfied. Usually they are implemented by hooks in the unification
algorithm using attributed variables [6]. SICStus Prolog provides the ability to
suspend a goal until a term is instantiated or ground, or until two terms are either
identical or definitely not identical, and supports conjunctions and disjunctions of these
conditions. ECLiPSe provides the ability to suspend a goal until a term is bound to a
variable or instantiated, and provides a user-extensible hook (constrained) which is
used to indicate any change made to a variable by a constraint solver. In HAL,
dynamic scheduling hooks (we call them delay conditions) are implemented by
individual constraint solvers, and are completely extensible.

In the remainder of this section we describe the general dynamic scheduling
mechanisms of HAL, and how Herbrand solvers fit into this scheme. In the next
section we discuss how this is implemented.



5.1 Dynamic Scheduling in HAL 

The HAL language provides a form of more “persistent” dynamic scheduling 
designed specifically to support constraint solving. A delay construct is of the 
form 



cond_1 ==> goal_1 | ... | cond_n ==> goal_n

where the goal goal_i will be executed every time the delay condition cond_i is
satisfied. This is useful, for example, if the delay condition is satisfied every time
the lower bound of a solver variable has changed. Delayed goals may also contain
calls to the special predicate kill/0. When this is executed, all delayed goals
in the immediate surrounding delay construct are killed; that is, they will never be
executed again.

The delay construct of HAL is designed to be extensible, so that programmers 
can build constraint solvers that support delay. In order to do so, one must create 
an instance of the delay type class defined as follows: 

:- class delay(D,I) <= delay_id(I) where [
       pred delay(D, I, pred),
       mode delay(oo, in, in(pred is semidet)) is semidet ].
:- class delay_id(I) where [
       impure pred get_id(I::out) is det,
       impure pred kill(I::in) is det ].

where type I represents the unique identifier (id) of each delay construct, type
D represents the supported delay conditions (such as bound(X) in the case of
the Herbrand solver), delay/3 takes a delay condition, an id and a goal, and
stores the information in order to execute the goal whenever the delay condition
holds, get_id/1 returns an unused id, and kill/1 causes all goals delayed for
the input id to no longer wake up.

The HAL compiler translates each delay construct into the base delay meth- 
ods provided by the classes as follows. Consider the generic delay construct 
shown above. This construct is translated into: 

get_id(Id), delay(cond_1, Id, goal'_1), ..., delay(cond_n, Id, goal'_n)

where each call to kill/0 in goal_i is replaced by a call to kill(Id) in goal'_i.
The separation of the delay type class into two parts allows different solver types
to share delay ids. Thus, we can build delay constructs which involve conditions
belonging to more than one solver as long as they use a common delay id.

(To simplify analysis, each goal_i must be semidet and may not change the
instantiation state of variables. As a result, the possibility of delayed code waking up can
be ignored during mode and determinism checking since such code can never change
the current instantiation or determinacy.)

As mentioned above, a constraint solver supporting dynamic scheduling must 
declare an instance of the delay/2 type class. In order to do so it needs to 

— define a type D expressing the kinds of allowable delay conditions; 

— define a type I for representing identities (ids) for delay constructs; 

— define the predicate get_id/1 which returns a new unused delay id;

— define the predicate kill/1 which causes all delaying code with the input 
delay id to no longer wake up (and hence effectively be removed from the 
solver); and 

— define the predicate delay/3 which takes a delay condition, delay id and a 
goal, and stores the information in order to execute the goal when the delay 
condition holds. 
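The obligations listed above can be pictured as a small bookkeeping object. The Python sketch below is not HAL and not the system's implementation: the class DelayStore, its wake method and the string-valued conditions are hypothetical, chosen only to show how one id ties together all the goals of a construct.

```python
import itertools


class DelayStore:
    """Hypothetical solver-side bookkeeping for get_id/1, delay/3 and kill/1."""

    def __init__(self):
        self._ids = itertools.count()
        self._goals = {}          # condition -> list of (delay id, goal)
        self._dead = set()        # ids of constructs that have been killed

    def get_id(self):
        """Return a new unused delay id."""
        return next(self._ids)

    def delay(self, cond, delay_id, goal):
        """Store the goal so that it runs whenever cond holds."""
        self._goals.setdefault(cond, []).append((delay_id, goal))

    def kill(self, delay_id):
        """Stop every goal registered under delay_id from waking again."""
        self._dead.add(delay_id)

    def wake(self, cond):
        """Called by the solver each time the condition cond becomes true."""
        for delay_id, goal in list(self._goals.get(cond, [])):
            if delay_id not in self._dead:
                goal(delay_id)


# Translation of a construct with two branches sharing one id.
store = DelayStore()
log = []
ident = store.get_id()
store.delay("lower_bound_changed(X)", ident, lambda i: log.append("propagate"))
store.delay("bound(X)", ident, lambda i: (log.append("done"), store.kill(i)))

store.wake("lower_bound_changed(X)")
store.wake("lower_bound_changed(X)")     # the same goal runs a second time
store.wake("bound(X)")                   # this goal kills the whole construct
store.wake("lower_bound_changed(X)")     # nothing wakes any more
```

Because both branches were registered under the same id, the kill issued by the second branch also silences the first.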

If the programmer uses the annotation deriving delay instead of using 
deriving herbrand when defining a constructor type t, the compiler will auto- 
matically generate a Herbrand constraint solver for t that supports delay. As we 
will see later, the reason to distinguish between Herbrand solvers that support 
delay and those which do not is a matter of efficiency: the implementation of 
delay for Herbrand solvers introduces an overhead that HAL programmers might 
wish to avoid when support for dynamic scheduling is not needed. 

In order to generate a Herbrand solver that supports delay, the HAL compiler 
makes use of the following types, classes, instances and predicates defined in the 
system module: 



:- export_abstract typedef herbrand_delay_id = int.
:- export typedef delay_cond(T) -> (bound(T) ; touched(T)).

:- export class herbrand_delay(T) <= herbrand(T) where [].
:- export instance delay_id(herbrand_delay_id).
:- export instance delay(delay_cond(T),herbrand_delay_id) <=
       herbrand_delay(T).

:- export impure pred get_id(herbrand_delay_id).
:-        mode get_id(out) is det.

:- export impure pred kill(herbrand_delay_id).
:-        mode kill(in) is det.

:- export pred delay(delay_cond(T),herbrand_delay_id,pred) <=
       herbrand_delay(T).
:-        mode delay(oo,in,in(pred is semidet)) is semidet.



The module defines the type herbrand_delay_id as an integer and abstractly
exports it (i.e. the type is visible from outside but its particular definition is
not). It also exports the type delay_cond(T) which defines the delay conditions
supported for a Herbrand variable of type T: bound(X) will succeed whenever
variable X becomes bound, while touched(X) will succeed whenever variable X
becomes bound or aliased to another variable which also has associated delayed
goals. While the bound(X) condition will succeed at most once, the touched(X)
condition may succeed more than once. Note that touched(X) does not wake
when X is bound to a variable without any associated delayed goals since such
a unification does not change the “meaning” of the constraint store.

The purpose of the herbrand_delay/1 class is simply to record which
Herbrand types support delay. The rest of the module exports the instances of classes
delay_id/1 and delay/2 which will be used by all Herbrand constraint solvers
that support delay, and the predicates which implement the associated methods.
All Herbrand solvers which support delay will use the common delay conditions
bound(X) and touched(X), the common delay id type herbrand_delay_id, and
its system-defined instance of delay_id. Note, however, that herbrand_delay_id
can also be used by user-defined solvers.

Based on the above types and classes, the only difference at compile-time
between a type defined as deriving herbrand and one defined as deriving delay
is that, for the latter, the HAL compiler automatically generates an instance of
the herbrand_delay/1 class, in addition to those of the herbrand/1, solver/1, and
eq/1 classes which are generated for both types.

As an example of the use of delay, the following code shows (part of) a sim- 
ple Boolean constraint solver which is implemented using Herbrand constraint 
solving. 

:- export typedef boolv -> ( f ; t ) deriving delay.

:- export pred and(boolv::oo, boolv::oo, boolv::oo) is semidet.
and(X,Y,Z) :-
    (   bound(X) ==> kill, (X = f -> Z = f ; Y = Z)
    |   bound(Y) ==> kill, (Y = f -> Z = f ; X = Z)
    |   bound(Z) ==> kill, (Z = t -> X = t, Y = t ; notboth(X,Y)) ).

:- export trust pure pred notboth(boolv::oo, boolv::oo) is semidet.
notboth(X,Y) :-
    (   bound(X) ==> kill, (X = t -> Y = f ; true)
    |   bound(Y) ==> kill, (Y = t -> X = f ; true)
    |   touched(X) ==> (X === Y -> kill, X = f ; true)
    |   touched(Y) ==> (X === Y -> kill, X = f ; true) ).

The constructor type boolv is used to represent Booleans. Since the type is
defined as deriving delay, the compiler will automatically generate instances
of the classes herbrand_delay/1, herbrand/1, solver/1 and eq/1. Thus old
variables of this type are allowed and represent unknown Boolean values.

The Boolean constraint solver defines two constraints: and(X,Y,Z), which
implements the formula X ∧ Y ↔ Z, and notboth(X,Y), which implements the
formula ¬X ∨ ¬Y. Both constraints are defined using dynamically scheduled code.
The code for and(X,Y,Z) delays until one of its arguments is bound (which for
this type is equivalent to ground), and then executes once (it is immediately
killed on wake up). If either X or Y is bound the constraint is solved. If Z is
bound to f the constraint notboth(X,Y) is created. Note that we could also
have made use of touched delay conditions in the definition of and.

(That touched(X) does not wake when X is aliased to a variable without delayed
goals is analogous to the case of unifying an attributed variable to a non-attributed
variable.)

The code for notboth(X,Y) delays until either X or Y is bound, in which
case the constraint is enforced, or until X or Y is touched (bound or unified with a
different variable which also has delayed code). In the second case, if X and Y are
identical (===), the delay construct is killed and both are set to false (the only
way to satisfy the constraint); otherwise the construct remains. This illustrates
how delayed code can be executed multiple times. Note that notboth/2 uses the
impure predicate ===; however, since the actions of notboth as seen from the
outside are pure, we use a trust pure declaration for the constraint.

To illustrate how dynamic scheduling works, consider the execution of goal: 

and(A,B,C), and(A,C,D), and(A,E,F), D = f , C = G, A = E, B = t . 

where all variables are assumed to have just been initialised. Initially all three
and constraints delay. When the constraint D = f is executed, and(A,C,D) wakes
up, kills its delay construct and calls notboth(A,C), which delays. When C = G
is executed, no delayed goal wakes up since there is nothing delaying on G. When
A = E is executed, notboth(A,C) wakes (since A is touched) but since A ===
C fails the wake-up does nothing. Executing B = t wakes and(A,B,C), which kills its
delay construct and adds the constraint A = C. This wakes notboth(A,C) since
it causes a touched event on A (and C); it finds that they are identical, kills its
delay construct and sets A (and C through the equality) to f. This wakes
and(A,E,F), which kills its delay construct and sets F to f. The solution is
A = C = D = E = F = G = f and B = t.
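The trace can be reproduced with a small simulation. The Python sketch below is not HAL: it replaces alias cycles by union-find style links, represents a delay construct by a flag, and approximates the touched condition by waking the merged touched goals only when both alias classes still hold live delayed goals. All names (Var, Construct, delay, unify, and_, notboth) are hypothetical.

```python
class Var:
    """Unbound Boolean variable; aliasing is modelled by a union-find style link."""
    def __init__(self):
        self.link, self.value = None, None
        self.bound_goals, self.touched_goals = [], []

class Construct:
    """One delay construct; killing it just clears the flag in this sketch."""
    def __init__(self):
        self.alive = True

def find(v):
    while v.link is not None:
        v = v.link
    return v

def val(v):
    return find(v).value

def kill(c):
    c.alive = False
    return True

def run(entries):
    for construct, goal in list(entries):
        if construct.alive and not goal():
            return False
    return True

def has_goals(v):
    return any(c.alive for c, _ in v.bound_goals + v.touched_goals)

def delay(kind, var, construct, goal):
    root = find(var)
    if root.value is not None:                 # already bound: run at once
        return goal() if construct.alive else True
    (root.bound_goals if kind == "bound" else root.touched_goals).append((construct, goal))
    return True

def unify(a, b):
    a = find(a) if isinstance(a, Var) else a
    b = find(b) if isinstance(b, Var) else b
    va = a.value if isinstance(a, Var) else a
    vb = b.value if isinstance(b, Var) else b
    if va is not None and vb is not None:
        return va == vb
    if va is not None or vb is not None:       # bind the unbound side
        var, value = (b, va) if va is not None else (a, vb)
        var.value = value
        return run(var.bound_goals + var.touched_goals)
    if a is b:
        return True
    wake = has_goals(a) and has_goals(b)       # touched only if both sides delay
    b.link = a
    a.bound_goals += b.bound_goals
    a.touched_goals += b.touched_goals
    return run(a.touched_goals) if wake else True

def notboth(x, y):
    c = Construct()
    same = lambda: val(x) is None and find(x) is find(y)
    fix = lambda: (kill(c) and unify(x, "f")) if same() else True
    return (delay("bound", x, c, lambda: kill(c) and (unify(y, "f") if val(x) == "t" else True))
            and delay("bound", y, c, lambda: kill(c) and (unify(x, "f") if val(y) == "t" else True))
            and delay("touched", x, c, fix)
            and delay("touched", y, c, fix))

def and_(x, y, z):
    c = Construct()
    return (delay("bound", x, c, lambda: kill(c) and (unify(z, "f") if val(x) == "f" else unify(y, z)))
            and delay("bound", y, c, lambda: kill(c) and (unify(z, "f") if val(y) == "f" else unify(x, z)))
            and delay("bound", z, c, lambda: kill(c) and ((unify(x, "t") and unify(y, "t"))
                                                          if val(z) == "t" else notboth(x, y))))

A, B, C, D, E, F, G = (Var() for _ in range(7))
steps = [and_(A, B, C), and_(A, C, D), and_(A, E, F),
         unify(D, "f"), unify(C, G), unify(A, E), unify(B, "t")]
result = {name: val(v) for name, v in zip("ABCDEFG", (A, B, C, D, E, F, G))}
```

Under these simplifications every step succeeds and the final bindings agree with the solution stated in the text.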

Currently HAL only supports simple delay conditions, rather than conjunc- 
tions or disjunctions of delay conditions. For example, it would be convenient to 
replace the last two lines of constraint notboth (X,Y) by the single line 

(touched(X) ;touched(Y)) ==> (X === Y -> kill, X = f ; true) 

These more complex delay conditions are not directly supported by HAL yet, 
but can be implemented by straightforward program transformation. 

6 Implementing Dynamic Scheduling 

In this section we begin by discussing the usual approach to implementing dy- 
namic scheduling for Herbrand constraints in the WAM, and then consider how 
it is implemented in HAL. 

6.1 Implementing Dynamic Scheduling in the WAM 

Most Prolog systems, including SICStus Prolog and ECLiPSe, support dynamic
scheduling based on Herbrand constraint solving using attributed variables [6].
For simplicity we shall illustrate the delay mechanism assuming a single (delay)
attribute, and only explain waking up when a variable is bound to a non-variable
using the builtin freeze which corresponds to the delay condition bound. See
also the section on Attributed Variables in [5] for a more detailed explanation.

Essentially a new kind of variable is introduced, which we will represent using
the tag ATT. An attributed variable is stored in two contiguous data cells. The
first cell acts like a variable, while the second cell is used to store the attributes
of the variable, which for our purposes is a list of goals to be executed when the
variable is bound to a non-variable.

The goal freeze(X,G) thus creates a new attributed variable Y with attribute
[G], and then unifies it with X.

Unification is extended to deal with attributed variables as follows. When an 
attributed variable X is unified with a non-variable term, then all the delayed 
goals in the delay attribute of X are executed. If an attributed variable X is 
unified with another attributed variable Y, then the two lists of delayed goals 
are concatenated, and the resulting list replaces that of the variable which will 
be pointed at after unification. 

For example, consider the goal 

G = write(X), freeze(X,G), H = write(g(Y)), freeze(Y,H) , X = Y, X = f(Z). 

After the first four literals are executed, the heap holds the two attributed vari- 
ables X and Y with their delayed goals, as shown on the left of Figure 11. During 
the unification of X and Y the two lists are appended replacing the attribute 
of Y, and X is pointed at Y, resulting in the heap state in the middle of Fig- 
ure 11. When X is bound to f(Z) it is first dereferenced to obtain Y, the goal 
list [G,H] is remembered for execution, and Y pointed to f (Z). The heap state 
is now as in the right of Figure 11. The delayed goals are then executed, causing 
f(Z)g(f(Z)) to be printed (although the other order g(f(Z))f(Z) is equally 
probable in practice). 
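The example can be replayed with a toy model of attributed variables. The Python sketch below is a simplification made for illustration: every variable carries a (possibly empty) goal list instead of being converted by freeze, compound terms are tuples, woken goals run immediately after the binding rather than at the next call, and the names AttVar, deref, show, freeze and unify are hypothetical. Only the variable-variable and variable-term cases of unification are covered.

```python
class AttVar:
    """Hypothetical attributed variable: a binding slot plus a list of delayed goals."""
    def __init__(self, name):
        self.name, self.ref, self.goals = name, None, []


def deref(t):
    while isinstance(t, AttVar) and t.ref is not None:
        t = t.ref
    return t


def show(t):
    t = deref(t)
    if isinstance(t, AttVar):
        return t.name
    if isinstance(t, tuple):                   # compound term (functor, args...)
        return f"{t[0]}({','.join(show(a) for a in t[1:])})"
    return str(t)


def freeze(x, goal):
    deref(x).goals.append(goal)                # assumes x is still unbound


def unify(a, b):
    a, b = deref(a), deref(b)
    if isinstance(a, AttVar) and isinstance(b, AttVar):
        if a is not b:
            b.goals = a.goals + b.goals        # concatenated list lives on the target
            a.ref = b                          # a now points at b
        return
    var, value = (a, b) if isinstance(a, AttVar) else (b, a)
    pending, var.goals = var.goals, []         # remember the goals for execution
    var.ref = value
    for goal in pending:
        goal()


out = []
X, Y, Z = AttVar("X"), AttVar("Y"), AttVar("Z")
freeze(X, lambda: out.append(show(X)))             # G = write(X)
freeze(Y, lambda: out.append(show(("g", Y))))      # H = write(g(Y))
unify(X, Y)
goals_on_y = len(Y.goals)
unify(X, ("f", Z))
printed = "".join(out)
```

With the goals kept in list order the sketch prints the first of the two orders mentioned in the text.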




Fig. 11. WAM heap representation for dynamically scheduled goals and after
executing each literal X = Y, X = f(Z).



Prolog systems typically include a global register for holding all the delayed
goals scheduled. The goals in this register are executed only at certain points in
the code, typically just before a predicate call is made.

Fig. 12. A delay node within an alias cycle.



6.2 Implementing Dynamic Scheduling in HAL 

As we saw in Section 5.1, each delay construct is converted by the compiler to
a more low-level set of delay primitives: get_id/1, kill/1 and delay/3. In the
following subsections we will explain how the procedures get_id/1, kill/1 and
delay/3 are implemented for Herbrand solvers.



6.3 Storing Dynamically Scheduled Goals 

Herbrand delay conditions bound(X) and touched(X) are associated with
variable X by placing an entry in the alias cycle associated with X. Since each entry
in the alias cycle must be a variable, they all have a variable tag (REF). Thus, 
we can use any other tag (which is already used by the type) to represent a delay 
node (DEL). We use tag 1. 

A delay node is stored as four consecutive heap cells as shown in Figure 12. 
These four components are: a dummy variable node which points to the next 
component, the DEL tagged delay node pointing to the next variable in the alias 
chain, a pointer to the doubly linked list of goals to be woken on a bound event, 
and a pointer to the doubly linked list of goals to be woken on a touched event. 
The system maintains at most one delay node in any alias cycle. The apparently 
unnecessary extra (dummy) variable node allows us to ensure that we never 
encounter the DEL tagged node in a context where it might be confused with 
the usual functor that uses tag 1. In particular, fix_copy performs a one step 
dereference on things which appear to be variables; we need to make sure it 
doesn’t encounter a delay node at that point or it will mistake it for a bound 
term. Note that this also means that we should take care when dereferencing a 
variable, since if we store the resultant address we may have a direct pointer to 
the dummy node, which if dereferenced will incorrectly appear to be a bound 
term. 

Adding a dynamically scheduled goal to the alias cycle is straightforward. 
We search the alias cycle for a delay node; if there isn’t one we create a new 
empty one and place it in the cycle. We then add the goal to the appropriate 
doubly linked list of goals (depending on the delay condition). Note that if the
variable is already bound, then the goal is simply executed immediately.

6.4 Modifying Unification for Delay

For herbrand_delay types we need to modify the code for manipulating variables
in order to recognize when a delay condition has been satisfied. When unifying an
alias cycle with a structure we know that both bound(X) and touched(X) are
satisfied for any variable X in the chain. Thus, we need to adjust the unify_var_val/2
algorithm to detect whether a delay node appears in the chain and, if so, execute
both lists of delayed goals. The code is shown in Figure 13 (cf. the original code
in Figure 8). If we detect that the next item in the chain has a DEL tag then we
are currently looking at the dummy variable in the chain, and the next element
is the delay node. We record this and skip past the delay node. Otherwise we
proceed as usual. If after traversing the chain we have detected a delay node, we
execute both lists of delayed goals.



unify_var_val(X,Y) {
    QueryX = X;
    DelayNode = null;
    repeat {
        Next = *QueryX;
        if (tag(*Next) != REF) {                 /* Found delay node */
            DelayNode = Next;                    /* save in DelayNode */
            QueryX = (strip_tag(*Next));         /* continue */
        } else {
            trail(QueryX);
            *QueryX = Y;
            QueryX = Next;
        } }
    until (QueryX == X)
    if (DelayNode) {
        execute_delayed_goals(*(DelayNode+1));   /* execute bound goals */
        execute_delayed_goals(*(DelayNode+2));   /* execute touched goals */
    } }



Fig. 13. Pseudo-C code for HAL unification of a variable and value supporting 
delay 



Unifying two alias cycles is more complex, as shown in Figure 14. If only one
variable chain contains a delay node, we proceed as in Figure 9. If both contain
a delay node, then we need to merge their delay nodes, and also wake up goals
with a touched delay condition. Note that we have to be careful not to insert
an extra node in between the first two elements (the cycle elements) of a delay
node.

If the variables are the same we immediately return; otherwise we look
through the X cycle until we either find Y (in which case we return), or find a
delay node, or complete the cycle. We then look through the Y cycle until we
either find X, in which case we return, or find a delay node or complete the
cycle. If we found no delay nodes, we proceed as before. If we find one delay
node, we insert the other chain just after the delay node. If we find two delay
nodes, we merge the lists of delayed goals into the delay node for X (using
merge_delay_goals) and then insert the X cycle just after the dummy node
in the cycle of Y, stripping out the rest of the delay node.

unify_var_var(X,Y) {
    if (X == Y) return;                        /* shortcut return */
    QueryX = X;                                /* search for delay node in X */
    DelayNodeX = null;
    repeat {
        NextX = *QueryX;
        if (NextX == Y) return;                /* shortcut return */
        if (tag(*NextX) != REF) {              /* found delay node */
            DelayNodeX = NextX;
            break; }
        QueryX = NextX; }
    until (QueryX == X);
    if (DelayNodeX == null) {                  /* no delay in X, just unify */
        NextY = *Y;                            /* search for insert place */
        if (tag(*NextY) != REF) {              /* found delay node */
            DelayNodeY = NextY;                /* add X to cycle for Y */
            trail(X); trail(DelayNodeY);       /* after Ys delay node */
            Tmp = strip_tag(*DelayNodeY);
            *DelayNodeY = add_tag(DEL,*X);
            *X = Tmp;
        } else {                               /* otherwise Y not dummy node */
            trail(X); trail(Y);
            Tmp = *X; *X = *Y; *Y = Tmp; }
        return; }
    QueryY = Y;                                /* search for delay node in Y */
    DelayNodeY = null;
    repeat {
        NextY = *QueryY;
        if (NextY == X) return;                /* shortcut return */
        if (tag(*NextY) != REF) {              /* found delay node */
            DelayNodeY = NextY;
            break; }
        QueryY = NextY; }
    until (QueryY == Y);
    if (DelayNodeY == null) {
        trail(Y); trail(DelayNodeX);           /* add Y to cycle for X */
        Tmp = strip_tag(*DelayNodeX);          /* after Xs delay node */
        *DelayNodeX = add_tag(DEL,*Y);
        *Y = Tmp;
    } else if (DelayNodeY == DelayNodeX)       /* same variable */
        return;
    else {
        merge_delay_goals(DelayNodeX, DelayNodeY);   /* merge into X delay */
        trail(QueryY); trail(DelayNodeX);
        *QueryY = strip_tag(*DelayNodeX);
        *DelayNodeX = *DelayNodeY;
        execute_delayed_goals(*(DelayNodeX+2));      /* execute touched goals */
    } }

Fig. 14. Pseudo-C code for HAL unification of two variables supporting delay

We now illustrate the execution of the same goal as previously considered
for the usual Prolog approach:

G = write(X), freeze(X,G), H = write(g(Y)), freeze(Y,H), X = Y, X = f(Z).

where freeze is defined by

freeze(X,G) :- (bound(X) ==> call(G)).

After the first four literals are executed the heap holds the two attributed vari- 
ables X and Y and their delay nodes which contain the delayed goals, as shown on 
the top of Figure 15. During the unification of X and Y the two lists are appended 
and the cycles are merged, eliminating the delay node of Y, resulting in the heap 
state in the middle of Figure 15. When X is bound to f (Z) the goal list [G,H] 
is remembered for execution, and every (non-delay) element in the cycle for X 
is pointed to f (Z) . The heap state is now as shown in the bottom of Figure 15. 
The delayed goals are then executed, causing f (Z)g(f (Z) ) to be printed. 

As we can see, the heap usage performed by the HAL representation is more 
complicated than that of the corresponding WAM representation. Note also that 
the addition of delay for a solver type potentially slows down all unifications for 
that type since we may need to search both alias cycles to determine if we have 
delay nodes in them. That is why HAL requires the user to explicitly indicate 
whether a Herbrand type requires support for delay, so that it can generate 
calls to the more efficient versions of unify_var_val and unify_var_var where
possible. 

6.5 Killing Dynamically Scheduled Code 

Because the dynamically scheduled code is potentially executed multiple times, 
the delay constructs need to be explicitly killed when they are no longer needed. 
As we have seen before, for Herbrand constructs the herband_delay_id type is 
an integer and the get_id predicate is thus implemented using a global integer 
counter. The ability to kill dynamically scheduled code is managed by associating 
with each herband_delay_id the list of delayed goal nodes that make up the 
construct. The kill/1 predicate simply traverses this list removing each delayed
goal node from the doubly linked list in which it occurs. 
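The bookkeeping described here can be sketched in Python as follows. This is an illustration only, not HAL's run-time code: GoalNode, GoalList, delay and the module-level tables are hypothetical names, and goals are stood in for by strings.

```python
import itertools


class GoalNode:
    """One delayed goal, linked into the goal list of a delay node."""
    def __init__(self, goal):
        self.goal, self.prev, self.next = goal, None, None


class GoalList:
    """Doubly linked list of delayed goals, so any node can be unlinked in O(1)."""
    def __init__(self):
        self.head = None

    def push(self, node):
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node

    def unlink(self, node):
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        node.prev = node.next = None

    def goals(self):
        found, cur = [], self.head
        while cur is not None:
            found.append(cur.goal)
            cur = cur.next
        return found


_next_id = itertools.count(1)   # the global integer counter behind get_id
_constructs = {}                # delay id -> [(goal list, goal node), ...]


def get_id():
    ident = next(_next_id)
    _constructs[ident] = []
    return ident


def delay(ident, goal_list, goal):
    """Record a delayed goal both in its list and under its construct id."""
    node = GoalNode(goal)
    goal_list.push(node)
    _constructs[ident].append((goal_list, node))


def kill(ident):
    """Remove every delayed goal node that belongs to the given construct."""
    for goal_list, node in _constructs.pop(ident, []):
        goal_list.unlink(node)


on_x, on_y = GoalList(), GoalList()
first, second = get_id(), get_id()
delay(first, on_x, "a")
delay(first, on_y, "b")
delay(second, on_x, "c")
kill(first)
print(on_x.goals(), on_y.goals())
```

Because each goal node carries both a predecessor and a successor pointer, killing a construct costs time proportional to the number of its own goal nodes, independent of how long the lists they sit in have become.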

7 Evaluation 

Our empirical evaluation has three aims. The first is to compare the performance
of HAL and its Herbrand solver with a state-of-the-art Prolog implementation,
SICStus Prolog. The second is to investigate the impact of each kind of declaration
on efficiency. The third is to compare HAL with Mercury so as to determine
the overhead introduced by the run-time support for Herbrand solving.

Footnote (on merging two delay nodes): Actually, by keeping track of the previous
pointers we can avoid using the dummy node for Y, unless the delay nodes are the
first things we encounter in both chains.

Fig. 15. HAL heap representation for dynamically scheduled goals and after
executing each literal X = Y, X = f(Z).

To achieve the first aim we take a number of Prolog benchmarks and compare
them with the equivalent HAL programs. In order to build these equivalent
programs we must first transform built-ins not present in HAL (such as cut)
into their HAL equivalents (such as if-then-else). Also, although Prolog does
not have type, mode and determinism declarations, the current HAL compiler 
requires them. We solve this problem by defining a “universal” constructor type 
for the HAL program which contains all functors occurring in the program and 
declaring this type to be a Herbrand solver type supporting dynamic scheduling 
by using deriving delay. 

Note that all integers, floats, chars and strings in the original Prolog program 
must be wrapped in the HAL program, and each wrapping functor must appear 
in the “universal” constructor type. Finally, all predicate arguments are declared
to have this type and mode oo, and all predicates are declared to have determinism
nondet. Most of these tasks are done automatically by a preprocessor.

For example, for the original hanoi Prolog program (the code in Section 2 
minus the declarations), the preprocessor will add the declarations 

:- typedef htype -> ( int(int) ; float(float) ; a ; b ; c ; []
                    ; [htype | htype] ; mv(htype, htype) ; htype-htype )
                    deriving delay.
:- pred hanoi(htype, htype).
:- mode hanoi(oo, oo) is nondet.
:- pred hanoi2(htype, htype, htype, htype, htype).
:- mode hanoi2(oo, oo, oo, oo, oo) is nondet.

The preprocessor will also replace the three occurrences of 1 in the program
text by int(1), and will create predicates for the wrapped versions of >, is and
the function -.
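The wrapping of constants can be pictured with a toy term rewriter. The sketch below is not the actual preprocessor: it assumes a hypothetical term encoding in which compound terms are Python tuples of the form (functor, arg1, ...), atoms and variable names are strings, and only integers and floats are wrapped (chars and strings, which the text says are wrapped as well, are omitted).

```python
def wrap(term):
    """Wrap numeric constants so that every term fits one "universal" type."""
    if isinstance(term, int):
        return ("int", term)
    if isinstance(term, float):
        return ("float", term)
    if isinstance(term, tuple):                 # compound term: (functor, args...)
        functor, *args = term
        return (functor, *(wrap(a) for a in args))
    return term                                 # atoms and variable names


# N1 is N - 1, written in the tuple encoding assumed above
goal = ("is", "N1", ("-", "N", 1))
print(wrap(goal))
```

The sketch only rewrites the constant; replacing is and - by their wrapped counterparts, as the preprocessor does, would be a further renaming step over the functors.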

To achieve our second aim of investigating the impact of each kind of dec- 
laration on efficiency, we take these Prolog-equivalent HAL programs and pro- 
gressively transform them as follows. 

- The first step is to add precise type information, i.e., to add the required type
definitions and accurate predicate type declarations. All types must still
be declared as Herbrand solver types supporting delay since the associated
terms may sometimes be treated as logical variables. This also implies that
we must continue to wrap integers and other primitive types since they may
be placed in data structures or equated before they are fixed.

- The second step is to remove the support for dynamic scheduling for those
Herbrand solver types upon which nothing is delayed. We simply replace the
directive deriving delay by the directive deriving herbrand wherever
possible.

- The third step adds accurate mode declarations. Types which are never
associated with the old instantiation need not be declared as Herbrand solver
types (i.e. their deriving herbrand directive is removed) and, in the case
of the primitive types, such types can have their wrapping removed.

- In the fourth and last step precise determinism declarations are added.

We then evaluate the efficiency of the programs obtained at each step. 

Our third and final aim is to compare the efficiency of HAL and Mercury to 
determine the overhead introduced by the run-time support for HAL, i.e., the 
overhead introduced by trailing, the reserved REF tag used for solver-types, ex- 
tra type classes, predicate renamings, etc. In order to do so we took the program 
resulting from compiling the HAL program obtained in the fourth step above, 
and modified it by using the Mercury libraries (instead of HAL ones), elimi- 
nating any unification-related code (which was actually dead-code anyway), and 
eliminating any predicate renaming introduced due to the use of type classes, 
etc. The resulting program was then compiled using two different compilation 
grades of Mercury: one that does not provide trailing and one that does. Both 
grades also avoid reserving the extra REF tag for solver-types, but are otherwise
equivalent to the Mercury grade used for compiling the HAL programs. Note
that since Mercury does not provide full unification, we could only do this for 
benchmarks with no remaining herbrand types. 

All timings are in seconds on a dual Pentium II-400MHz with 632M of RAM 
running Linux 2.2.9. We have turned garbage collection off in all three systems: 
SICStus Prolog 3.8.6 (compact code), Mercury (release-of-the-day 2003-08-09 
version), and HAL. 

We have used a subset of the standard Prolog benchmarks: aiakl, boyer, 
deriv, fib, mmatrix, serialize, tak, warplan, hanoi and qsort. The last 
two are shown in two forms, one using “normal” lists and append/3, the other 
using difference lists. The reason for choosing these benchmarks is that they did
not require extensive changes to the original Prolog benchmarks and hence
the comparison is fairer. To these we added two HAL benchmarks using delay,
both based around Boolean constraint solving. The first, bqueens, is the classic
n-queens problem; the second, nono, is a nonogram solver.



Benchmark  Preds  Lits  OSICS   SICS   None      T     TS   TSM  TSMD  Merc+tr  Merc
aiakl          7    21   0.09   0.08   0.39   0.94   0.97  0.02  0.03     0.03  0.01
boyer         14   124   1.79   0.51   2.36   2.00   2.23  0.11  0.05     0.08  0.02
bqueens       23    99      -  73.38   4.86   5.04   5.04  4.77  4.73        -     -
deriv          1    33   1.54   2.41   5.02   4.88   4.08  0.83  0.68     0.69  0.15
fib            1     6   1.20   1.21   0.36   0.33   0.27  0.02  0.02     0.01  0.01
hanoiapp       2     7   2.57   2.61   6.30  14.36  13.77  0.64  0.32     0.27  0.19
hanoidiff      2     6   1.81   1.75   0.54   0.73   0.74  0.66  0.63        -     -
mmatrix        3     7   1.26   1.26   1.22   2.96   2.35  0.10  0.05     0.04  0.01
nono          30   181      -  16.35  11.21  17.56  17.56  2.12  2.08        -     -
qsortapp       3    10   2.94   1.60   5.14  10.13  10.10  0.51  0.22     0.21  0.11
qsortdiff      3    10   2.91   1.64   5.22   9.92  10.06  0.53  0.24        -     -
serialize      5    19   1.41   1.36   2.30   2.56   2.83  0.63  0.46        -     -
tak            1     9   0.49   0.60   0.90   0.76   0.68  0.08  0.06     0.05  0.01
warplan       25    88   0.51   0.60   2.12   1.14   1.06  0.40  0.32        -     -
Average                          1.16   0.77   0.77   1.04  8.61  1.38     1.11  2.72

Table 1. Execution times in seconds



Footnote (on changes to the benchmarks): aiakl, deriv, qsort, serialize and tak
only required replacement of cuts by if-then-else, while warplan also needed the
\+ built-in to be transformed into an if-then-else and a well-typed version of
univ to be included. The only exception is boyer, for which the starting point
was a restricted Mercury version, rather than the Prolog one.

Footnote (on nonograms): See e.g. http://www.puzzlemuseum.com/griddler/griddler.htm

Table 1 provides the execution time for the benchmarks. The second and
third columns of Table 1 detail the benchmark sizes (number of predicates and
literals before normalization, excluding dead code and the query). Subsequent
columns give the execution time for:

- the original program run with SICStus Prolog (OSICS),

- the modified Prolog program run with SICStus Prolog (SICS),

- the Prolog-equivalent HAL program (obtained with the preprocessor) which
contains no precise declarations (None),

- with precise type declarations (T),

- with precise type declarations and scheduling information (i.e. replacing
deriving delay by deriving herbrand wherever possible) (TS),

- with precise type declarations, scheduling information, and mode declarations
(TSM),

- with precise type declarations, scheduling information, and mode and determinism
declarations (TSMD),

- this last version run with Mercury (if possible) compiled with trailing support
(Merc+tr),

- the same Mercury version without trailing support (Merc).

The last row of the table contains the geometric mean speed ratio between
the preceding column and the current column. For example, programs in the
TSM column are, on average, 8.61 times as fast as the corresponding program
in the TS column.
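As a check on how this last row is read, the geometric mean of the per-benchmark ratios can be recomputed from the TS and TSM columns of Table 1. The helper name geometric_mean_speedup is ours; the timings are copied from the table in row order (aiakl to warplan).

```python
import math


def geometric_mean_speedup(before, after):
    """Geometric mean of the per-benchmark ratios before/after."""
    ratios = [b / a for b, a in zip(before, after)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# TS and TSM columns of Table 1, rows aiakl .. warplan
ts = [0.97, 2.23, 5.04, 4.08, 0.27, 13.77, 0.74,
      2.35, 17.56, 10.10, 10.06, 2.83, 0.68, 1.06]
tsm = [0.02, 0.11, 4.77, 0.83, 0.02, 0.64, 0.66,
       0.10, 2.12, 0.51, 0.53, 0.63, 0.08, 0.40]

speedup = geometric_mean_speedup(ts, tsm)
print(f"TS -> TSM geometric mean speed ratio: {speedup:.2f}")
```

The result comes out at about 8.61, in line with the Average row. A geometric rather than arithmetic mean keeps one extreme ratio (such as aiakl here) from dominating the summary.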

The benchmarks nono and bqueens use dynamic scheduling code which is 
required to be semidet. Hence, we required some modification of the original 
code to ensure that the determinism was checkable by the compiler for versions 
before TSMD. 

In general, the original and modified SICStus programs have similar speed;
deriv slows down because of the loss of indexing caused by the introduction of
if-then-elses, while the two versions of quick sort improve because a badly placed
cut in the original program is replaced by a more efficient if-then-else.

The Prolog-equivalent HAL versions are mostly slower than the modified 
SICStus versions. Slow-down occurs in aiakl, boyer and warplan because no 
indexing is currently available for possibly unbound input arguments. A surprising
speed-up occurs for fib and hanoidiff; we suspect this is because of Mercury’s
handling of recursion. For the benchmarks with delay, since the scheduling strategies
cannot be made the same, the comparison is rather meaningless.

Generally, adding precise type information leads to a slow down (on average 
0.77 times as fast). For the version with no information, we used a monomorphic 
“universal” type which included all the functors in the program. For the version 
with type information, we use the polymorphic types where appropriate. The 
slow down is due to the use of polymorphic unification predicates. The compiler 
could remove this cost by providing type specialized versions of these predicates 
(indeed if we use only non-polymorphic types the relative performance is 1.33 
in favour of types). The programs fib and tak do not use polymorphic types 
and therefore do not incur this cost. We see improvements for both of these 
benchmarks. For warplan we gain a large improvement because it allows a type 
specialized version of univ to be used. 
Adding precise scheduling information provides a modest improvement for
most of the benchmarks (average 1.04 times). It provides no improvement for 
bqueens and nono, both of which make extensive use of dynamic scheduling. 

Adding mode declarations provides the most speed-up (on average 8.61 times). 
This is because it allows calls to the Herbrand solver to be replaced by calls to 
Mercury’s specialized term manipulation operations and also allows indexing. 
Interestingly, bqueens obtains no speedup since the bulk of the time is in the
search, using the dynamic scheduling, and this is unchanged. For nono the dynamically
scheduled code is itself complex, and so benefits from mode information.

Determinism declarations also lead to significant speed-up (on average 1.38 
times). Again the benchmarks with dynamic scheduling are the least affected, 
since the search dominates. 

The times given in the final three columns of Table 1 are too small to make a
meaningful comparison. For that reason, Table 2 shows the execution times for
100 repeats of each benchmark. We omit bqueens, hanoidiff, nono, qsortdiff,
serialize and warplan since their final HAL versions still need herbrand types.



Benchmark    TSMD  Merc+tr   Merc
aiakl        4.85      4.3   3.55
boyer        9.37    10.53   9.97
deriv       79.73    76.02  35.52
fib          2.61     2.61   1.17
hanoiapp    40.07    40.15  34.78
mmatrix      5.27     4.99   4.99
qsortapp    32.79    33.25  24.23
tak          6.06     6.35    4.2
Average               1.01   1.40

Table 2. Execution times in seconds for 100 repeats



The HAL version running with precise declarations is very similar to the 
Mercury version with trailing support. When we compile the Mercury version 
without trailing support we see an improvement of 1.4 times on average. 

We have also investigated the effect of the declarations on memory usage. 
Table 3 shows the trail usage for each benchmark, whereas Table 4 shows heap 
usage. The size of the trail is mostly affected by the presence or absence of precise 
mode declarations. Adding precise mode declarations greatly reduces trail size 
— only those benchmarks with Herbrand solver types may need to use the trail. 

In many cases, adding precise type definitions causes a significant increase in 
heap usage. This is due to the use of polymorphic data types. The unification 
predicates for such types construct data structures for run time type information 
on the heap, and the affected benchmarks make many calls to these predicates. 

Benchmark    None     TS    TSM   TSMD  Merc
aiakl        3637   2641      0      0     0
boyer        4904   4904      0      0     0
bqueens      3562   3581   3446   3446     -
deriv       40530  40530      0      0     0
fib          1897   1897      0      0     0
hanoiapp    72704  72704      0      0     0
hanoidiff    7168   7168   6144   6144     -
mmatrix      7970   7970      0      0     0
nono          953    953    307    307     -
qsortapp    51449  51449      0      0     0
qsortdiff   51126  51126    352    352     -
serialize   17244  17244   1552   1552     -
tak          5173   5173      0      0     0
warplan        34     34      2      2     -

Table 3. Memory usage in Kbytes for the Trail

Benchmark     None      TS     TSM    TSMD   Merc
aiakl         2712   38498    1231    1231   1231
boyer         5948    5950    3561    3561   3561
bqueens      81074  641074  101074  101074      -
deriv        27712   27712   24949   24949  24949
fib           2371    2371       0       0      0
hanoiapp     41472  438783   37888   36864  36864
hanoidiff     6656   20480   57344   57344      -
mmatrix      19610   47659      79      79     79
nono        641082  641074  641082  641082      -
qsortapp     25842  269666   25607   25490  25490
qsortdiff    25446  261314   28317   28317      -
serialize     8928   90622    8331    8331      -
tak           5173    5173       0       0      0
warplan         23      22      18      18      -

Table 4. Memory usage in Kbytes for the Heap

Adding precise modes causes a significant reduction in heap size for most
benchmarks. This is mainly because most of the calls to the unification predicates
can be removed. It is also no longer necessary to box primitive types, such as
ints and floats. For example, without such boxing fib and tak use no heap
space at all.

Finally, we have investigated the size of the alias cycles constructed using
PARMA bindings. The results are shown in Table 5. Virtually all cycles have
length one immediately before being bound to a non-variable term. Only four
benchmarks, bqueens, deriv, warplan and serialize, have a maximum cycle
length of more than two (154, 129, 4 and 18 respectively). The cycles disappear
for deriv with mode information.

Benchmark        None         TS        TSM       TSMD
aiakl          <1 (2)      0 (1)      0 (1)      0 (1)
boyer          <1 (2)     <1 (2)      0 (1)      0 (1)
bqueens      80 (154)   85 (154)  100 (154)  100 (154)
deriv        <1 (129)   <1 (129)      0 (1)      0 (1)
fib             0 (1)      0 (1)      0 (1)      0 (1)
hanoiapp        0 (1)      0 (1)      0 (1)      0 (1)
hanoidiff      25 (2)     25 (2)    100 (2)    100 (2)
mmatrix        <1 (2)      0 (1)      0 (1)      0 (1)
qsortapp        0 (1)      0 (1)      0 (1)      0 (1)
qsortdiff      <1 (2)     <1 (2)    100 (2)    100 (2)
serialize      1 (18)     1 (18)   100 (18)   100 (18)
tak             0 (1)      0 (1)      0 (1)      0 (1)
warplan        <1 (4)      1 (4)     99 (4)     99 (4)

Table 5. Percentage of chains with more than one element, and maximum chain
length (in parentheses)

The percentage of non-unit cycles increases dramatically for hanoidiff,
qsortdiff, serialize and warplan with the addition of mode information.
However, this is not because the number of non-unit cycles has increased but,
rather, because the number of unit cycles is reduced to zero or nearly zero (and
thus essentially all remaining cycles are non-unit cycles). This is due to the
addition of mode information, which allows us to remove the deriving herbrand
declarations for some types, thus avoiding the use of PARMA chains when binding
variables of those types.
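The effect on the percentages is purely arithmetic, as a small calculation with made-up counts shows. The paper reports only percentages, so the cycle counts below are hypothetical and chosen to reproduce a jump from 25% to 100% like the one seen for hanoidiff.

```python
def percent_non_unit(unit_cycles, non_unit_cycles):
    """Share of cycles with more than one element, as a percentage."""
    total = unit_cycles + non_unit_cycles
    return 0.0 if total == 0 else 100.0 * non_unit_cycles / total


# Hypothetical counts: the non-unit cycles stay the same, the unit cycles vanish.
before_modes = percent_non_unit(unit_cycles=75, non_unit_cycles=25)
after_modes = percent_non_unit(unit_cycles=0, non_unit_cycles=25)
print(before_modes, after_modes)
```

The numerator is unchanged in both calls; only the denominator shrinks once variables of the affected types no longer create PARMA chains at all.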

8 Related Work 

As far as we know, HAL is the first logic programming implementation to use 
the PARMA variable representation and binding scheme since it was introduced 
in [12]. We note that [8] discusses in detail the differences between the PARMA 
and WAM schemes. However, there seems to be no compelling reason to prefer 
one over the other; in fact, artificial examples can be constructed for which 
each scheme easily outperforms the other. There has been some earlier work on 
the impact of type, mode and determinism information on the performance of 
Prolog, but the results are quite uneven. In [9], information about type, mode 
and determinism is used to (manually) generate better code. Its results show 
up to a factor of two speedup for mode information, and the same result for 
type information. [13] describes Aquarius, a Prolog system in which compile- 
time analysis information (including type, mode and determinism information) 
is used for optimizing the execution. In its results, analysis information had a 
relatively low impact on speed: on average about 50% for small programs without 
built-ins (for tak 300%) and about 12% for larger programs with built-ins (for 
boyer only 3%). Finally, in the context of the PARMA system, [12] also reports 
on speedup obtained from information provided by compile time analysis. Its 
results are highly benchmark dependent, with only 10% speed up for boyer but
a factor of 8 for nrev. 

It is difficult to directly compare our results (from Section 7) with those found 
for Aquarius and PARMA. One problem is the differences between the under- 
lying abstract machines and the optimizations performed by each compiler. For 
instance, Mercury performs particular optimizations such as specializing the tags
per type, using separate stacks for deterministic and nondeterministic
predicates, and a middle-recursion optimization, none of which are found in PARMA
or Aquarius. On the other hand, Mercury lacks real last call optimization. However,
in accord with our findings, for all systems mode information gives greater
speedups than type information. Another problem is that their information is 
obtained from compile time analysis, rather than from programmer declarations. 
We suspect that compile time analysis is not powerful enough to find accurate 
information about the larger benchmarks, while in our experiments the pro- 
grammer provides this information. This would explain why our performance 
improvements are more uniform (and larger) across all benchmarks, regardless 
of size. 

9 Conclusions 

Our empirical evaluation of HAL is very pleasing. It demonstrates that it is 
possible to combine Mercury-like efficiency for ground data structure manipu- 
lation with Prolog-style logical variables by using PARMA bindings to ensure 
that the representation for terms used by HAL’s Herbrand solver is consistent 
with that used by Mercury for ground terms. This means that the compiler is 
free to use the more efficient Mercury term manipulation operations whenever 
this is possible. 

There are however a number of ways to improve HAL’s Herbrand constraint 
solving which we shall investigate. These include better tracking of where one- 
step dereferencing may be (or rather, is not) required, and more specialized cases 
for equality and indexing for old terms. 

Prolog-like programs written in HAL run somewhat slower than in SICStus, 
in part because there is no term indexing for possibly unbound instantiations. 
However, once declarations are provided the programs run an order of magni- 
tude faster. (Much of this arises from the sophisticated compilation techniques 
used by the underlying Mercury compiler.) Our results show that the biggest 
performance improvement arises from mode declarations while type and deter- 
minism declarations give moderate speed improvement. All declarations reduce 
the space requirements. 

It should be remembered that declarations are not only useful for improving 
efficiency. They also allow compile time checking to improve program robust- 
ness, help program debugging and facilitate integration with foreign language 
procedures. 